Description

FIELD OF TECHNOLOGY 

The present invention relates to a novel DNA and a process for preparing a protein which possesses an activity to
inhibit osteoclast differentiation and/or maturation (hereinafter called osteoclastogenesis-inhibitory activity) by a genetic
engineering technique using the DNA. More particularly, the present invention relates to a genomic DNA encoding a
protein OCIF which possesses an osteoclastogenesis-inhibitory activity and a process for preparing said protein by a
genetic engineering technique using the genomic DNA.

BACKGROUND OF THE INVENTION

Human bones constantly undergo resorption and formation. Osteoblasts, which control bone formation, and
osteoclasts, which control bone resorption, play the major roles in this process. Osteoporosis is a typical disease
caused by abnormal bone metabolism. It arises when bone resorption by osteoclasts exceeds bone
formation by osteoblasts. Although the mechanism of this disease is not yet completely elucidated, the disease
causes the bones to ache, makes the bones fragile, and may result in fractures. As the aged population
increases, this disease increases the number of bedridden elderly people, which has become a social problem. Urgent
development of a therapeutic agent for this disease is strongly desired. Diseases due to a decrease in bone mass are
expected to be treated by controlling bone resorption, accelerating bone formation, or improving the balance between bone
resorption and formation.

Osteogenesis is expected to increase by accelerating proliferation, differentiation, or activation of the cells controlling
bone formation, or by controlling proliferation, differentiation, or activation of the cells involved in bone resorption.
In recent years, strong interest has been directed to physiologically active proteins (cytokines) exhibiting such activities
as described above, and energetic research is ongoing on this subject. The cytokines which have been reported to
accelerate proliferation or differentiation of osteoblasts include the proteins of the fibroblast growth factor family (FGF:
Rodan S. B. et al., Endocrinology vol. 121, p 1917, 1987), insulin-like growth factor I (IGF-I: Hock J. M. et al.,
Endocrinology vol. 122, p 254, 1988), insulin-like growth factor II (IGF-II: McCarthy T. et al., Endocrinology vol. 124, p 301, 1989),
Activin A (Centrella M. et al., Mol. Cell. Biol., vol. 11, p 250, 1991), transforming growth factor-β (Noda M., The Bone,
vol. 2, p 29, 1988), Vasculotropin (Varonique M. et al., Biochem. Biophys. Res. Commun., vol. 199, p 380, 1994), and
the proteins of the heterotopic bone formation factor family (bone morphogenic protein; BMP: BMP-2; Yanaguchi A. et al., J.
Cell Biol. vol. 113, p 682, 1991; OP-1; Sampath T. K. et al., J. Biol. Chem. vol. 267, p 20532, 1992, and Knutsen R. et
al., Biochem. Biophys. Res. Commun. vol. 194, p 1352, 1993).

On the other hand, as cytokines which suppress differentiation and/or maturation of osteoclasts, transforming
growth factor-β (Chenu C. et al., Proc. Natl. Acad. Sci. USA, vol. 85, p 5683, 1988), interleukin-4 (Kasano K. et al.,
Bone-Miner., vol. 21, p 179, 1993), and the like have been reported. Further, as cytokines which suppress bone
resorption by osteoclasts, calcitonin (Bone-Miner., vol. 17, p 347, 1992), macrophage colony stimulating factor
(Hattersley G. et al., J. Cell. Physiol., vol. 137, p 199, 1988), interleukin-4 (Watanabe K. et al., Biochem. Biophys. Res.
Commun. vol. 172, p 1035, 1990), and interferon-γ (Gowen M. et al., J. Bone Miner. Res., vol. 1, p 469, 1986) have been
reported.

These cytokines are expected to be used as agents for treating diseases accompanying bone loss by accelerating
bone formation or suppressing bone resorption. Clinical tests are being undertaken to verify the effect on
bone metabolism of some cytokines such as insulin-like growth factor-I and the heterotopic bone formation factor family.
In addition, calcitonin is already commercially available as a therapeutic agent for osteoporosis and a pain relief agent.
At present, drugs for clinically treating bone diseases or shortening the period of treatment of bone diseases include
activated vitamin D3, calcitonin and its derivatives, and hormone preparations such as estradiol agents, ipriflavon, or
calcium preparations. These agents are not necessarily satisfactory in terms of efficacy and therapeutic results.
Development of a novel therapeutic agent which can be used in place of these agents is strongly desired.

In view of this situation, the present inventors have undertaken extensive studies. As a result, the present inventors
found protein OCIF exhibiting an osteoclastogenesis-inhibitory activity in a culture broth of human embryonic lung
fibroblast IMR-90 (ATCC Deposition No. CCL186), and filed a patent application (PCT/JP96/00374). The present
inventors have conducted further studies relating to the origin of this protein OCIF exhibiting the
osteoclastogenesis-inhibitory activity. The studies have led to determination of the sequence of a genomic DNA encoding the
human OCIF. Accordingly, an object of the present invention is to provide a genomic DNA encoding protein OCIF
exhibiting osteoclastogenesis-inhibitory activity and a process for preparing this protein by a genetic engineering technique
using the genomic DNA.

DISCLOSURE OF THE INVENTION

Specifically, the present invention relates to a genomic DNA encoding protein OCIF exhibiting
osteoclastogenesis-inhibitory activity and a process for preparing this protein by a genetic engineering technique using the genomic DNA.
The DNA of the present invention includes the nucleotide sequences No. 1 and No. 2 in the Sequence Table attached
hereto.

Moreover, the present invention relates to a process for preparing a protein, comprising inserting a DNA including
the nucleotide sequences of the sequences No. 1 and No. 2 in the Sequence Table into an expression vector, producing
a vector capable of expressing a protein having the following physicochemical characteristics and exhibiting the activity
of inhibiting differentiation and/or maturation of osteoclasts, and producing this protein by a genetic engineering
technique:

(a) molecular weight (SDS-PAGE):

(i) Under reducing conditions: about 60 kD,
(ii) Under non-reducing conditions: about 60 kD and about 120 kD;

(b) amino acid sequence:

includes an amino acid sequence of the Sequence ID No. 3 of the Sequence Table,

(c) affinity:

exhibits affinity to a cation exchanger and heparin, and

(d) thermal stability:

(i) the osteoclast differentiation and/or maturation inhibitory activity is reduced when treated with heat at 70°C
for 10 minutes or at 56°C for 30 minutes,
(ii) the osteoclast differentiation and/or maturation inhibitory activity is lost when treated with heat at 90°C for
10 minutes.

The protein obtained by expressing the gene of the present invention exhibits an osteoclastogenesis-inhibitory
activity. This protein is effective as an agent for the treatment and improvement of diseases involving a decrease in the
amount of bone such as osteoporosis, diseases relating to bone metabolism abnormality such as rheumatism,
degenerative joint disease, or multiple myeloma, and is useful as an antigen to establish an immunological diagnosis of such
diseases.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 shows the result of Western blotting analysis of the protein obtained by expressing the genomic DNA of the present
invention in Example 4 (iii), wherein lane 1 is a marker, lane 2 is the culture broth of
COS-7 cells transfected with the vector pWESRαOCIF (Example 4 (iii)), and lane 3 is the culture broth of
COS-7 cells transfected with the vector pWESRα (control).

BEST MODE FOR CARRYING OUT THE INVENTION 

The genomic DNA encoding the protein OCIF which exhibits osteoclastogenesis-inhibitory activity in the present
invention can be obtained by preparing a cosmid library using a human placenta genomic DNA and a cosmid vector
and by screening this library using DNA fragments which are prepared based on the OCIF cDNA as a probe. The
thus-obtained genomic DNA is inserted into a suitable expression vector to prepare an OCIF expression cosmid. A
recombinant type OCIF can be obtained by transfecting the genomic DNA into a host organism such as various types of cells
or microorganism strains and causing the DNA to express a protein by a conventional method. The resultant protein
exhibiting osteoclastogenesis-inhibitory activity (an osteoclastogenesis-inhibitory factor) is useful as an agent for the
treatment and improvement of diseases involving a decrease in bone mass such as osteoporosis and other diseases
relating to bone metabolism abnormality and also as an antigen to prepare antibodies for establishing immunological
diagnosis of such diseases. The protein of the present invention can be prepared as a drug composition for oral or
non-oral administration. Specifically, the drug composition of the present invention containing the protein which is an
osteoclastogenesis-inhibitory factor as an active ingredient can be safely administered to humans and animals. As the form
of drug composition, a composition for injection, composition for intravenous drip, suppository, nasal agent, sublingual
agent, percutaneous absorption agent, and the like are given. In the case of the composition for injection, such a
composition is a mixture of a pharmacologically effective amount of osteoclastogenesis-inhibitory factor of the present
invention and a pharmaceutically acceptable carrier. The composition may further comprise amino acids, saccharides,
cellulose derivatives, and other excipients and/or activation agents, including other organic compounds and inorganic
compounds which are commonly added to a composition for injection. When an injection preparation is prepared using
the osteoclastogenesis-inhibitory factor of the present invention and these excipients and activation agents, a pH
adjuster, buffering agent, stabilizer, solubilizing agent, and the like may be added if necessary to prepare various types
of injection agents.

The present invention will now be described in more detail by way of examples which are given for the purpose of 
illustration and not intended to be limiting of the present invention. 

Example 1

( Preparation of a cosmid library ) 

A cosmid library was prepared using human placenta genomic DNA (Clontech; Cat. No. 6550-2) and pWE15
cosmid vector (Stratagene). The experiment principally followed the protocol attached to the pWE15
cosmid vector kit of Stratagene, and Molecular Cloning: A Laboratory Manual (Cold Spring Harbor
Laboratory (1989)) was referred to for common procedures for handling DNA, E. coli, and phage.

(i) Preparation of restriction enzyme digest of human genomic DNA

Human placenta genomic DNA dissolved in 750 µl of a solution containing 10 mM Tris-HCl, 10 mM MgCl2, and 100
mM NaCl was added to four 1.5 ml Eppendorf tubes (tubes A, B, C, and D) in the amount of 100 µg each. Restriction
enzyme MboI was added to these tubes in the amounts of 0.2 unit for tube A, 0.4 unit for tube B, 0.6 unit for tube C, and
0.8 unit for tube D, and the DNA was digested for 1 hour. Then, EDTA was added to each tube to a concentration of
20 mM to terminate the reaction, followed by extraction with phenol/chloroform (1:1). A two-fold volume of
ethanol was added to the aqueous layer to precipitate DNA. DNA was collected by centrifugation and washed with 70%
ethanol, and the DNA in each tube was dissolved in 100 µl of TE (10 mM Tris-HCl (pH 8.0) + 1 mM EDTA buffer solution,
hereinafter called TE). The DNA in the four tubes was combined in one tube and incubated for 10 minutes at 68°C. After cooling to room
temperature, the mixture was overlaid onto a 10%-40% linear sucrose gradient which was prepared in a buffer
containing 20 mM Tris-HCl (pH 8.0), 5 mM EDTA, and 1 mM NaCl in a centrifuge tube (38 ml). The tube was centrifuged
at 26,000 rpm for 24 hours at 20°C using a rotor SRP28SA manufactured by Hitachi, Ltd., and 0.4 ml fractions of the
sucrose gradient were collected using a fraction collector. A portion of each fraction was subjected to 0.4% agarose
electrophoresis to confirm the size of the DNA. Fractions containing DNA with a length of 30 kb (kilobase pairs) to 40 kb were
then combined. The DNA solution was diluted with TE to bring the sucrose concentration to 10% or less, and a 2.5-fold
volume of ethanol was added to precipitate DNA. The DNA was dissolved in TE and stored at 4°C.
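
The dilution step at the end of this procedure is simple arithmetic, and can be sketched as follows. This is only an illustrative calculation: the pooled volume and sucrose percentage used in the example are hypothetical values, not figures from this specification, and the function names are introduced here for illustration. It assumes additive mixing and that TE contains no sucrose.

```python
def te_volume_to_add(pool_volume_ml, pool_sucrose_pct, target_pct=10.0):
    """Volume of TE (ml) needed to bring sucrose down to target_pct.

    Assumes simple additive mixing and that TE contains no sucrose.
    """
    if pool_volume_ml < 0 or pool_sucrose_pct < 0 or target_pct <= 0:
        raise ValueError("volumes and concentrations must be non-negative, target positive")
    if pool_sucrose_pct <= target_pct:
        return 0.0  # already at or below the target concentration
    return pool_volume_ml * (pool_sucrose_pct / target_pct - 1.0)


def ethanol_volume(dna_solution_ml, fold=2.5):
    """Volume of ethanol (ml) for a `fold`-volume precipitation."""
    return dna_solution_ml * fold


# Hypothetical pooled fractions: 2.0 ml at about 25% sucrose.
pool_ml, pool_pct = 2.0, 25.0
te_ml = te_volume_to_add(pool_ml, pool_pct)
diluted_ml = pool_ml + te_ml
print(f"TE to add: {te_ml:.1f} ml, ethanol to add: {ethanol_volume(diluted_ml):.1f} ml")
```

Because the gradient runs up to 40% sucrose, a pool taken from the densest region would need at least a four-fold dilution under these assumptions.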

(ii) Preparation of cosmid vector

The pWE15 cosmid vector obtained from Stratagene was completely digested with restriction enzyme
BamHI according to the protocol attached to the cosmid vector kit. DNA collected by ethanol precipitation was dissolved
in TE to a concentration of 1 mg/ml. The phosphate group at the 5'-end of this DNA was removed using calf small intestine
alkaline phosphatase, and the DNA was collected by phenol extraction and ethanol precipitation. The DNA was dissolved
in TE to a concentration of 1 mg/ml.

(iii) Ligation of genomic DNA to vector and in vitro packaging

1.5 µg of genomic DNA fractionated according to size and 3 µg of pWE15 cosmid vector which had been
digested with restriction enzyme BamHI were ligated in 20 µl of a reaction solution using Ready-To-Go T4 DNA ligase
(Pharmacia). The ligated DNA was packaged in vitro using Gigapack™ II packaging extract (Stratagene)
according to the protocol. After the packaging reaction, a portion of the reaction mixture was diluted stepwise with
SM buffer solution and mixed with E. coli XL1-Blue MR (Stratagene) suspended in 10 mM MgCl2 to allow
phage infection, and plated onto LB agar plates containing 50 µg/ml of ampicillin. The number of colonies produced was
counted. The number of colonies per 1 µl of packaging reaction was calculated based on this result.
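
The titer calculation described above can be sketched as follows. This is a minimal illustration only: the colony count, dilution factor, and plated volume in the example are hypothetical, and the function names are introduced here rather than taken from the kit protocol.

```python
def colonies_per_microliter(colony_count, dilution_factor, plated_volume_ul):
    """Colonies per µl of undiluted packaging reaction.

    colony_count     : colonies counted on the plate
    dilution_factor  : total fold dilution of the plated sample (e.g. 100 for 1:100)
    plated_volume_ul : volume of diluted sample mixed with E. coli and plated (µl)
    """
    if dilution_factor <= 0 or plated_volume_ul <= 0:
        raise ValueError("dilution factor and plated volume must be positive")
    return colony_count * dilution_factor / plated_volume_ul


def reaction_volume_for_plate(target_colonies, titer_per_ul):
    """Volume of packaging reaction (µl) expected to give target_colonies."""
    if titer_per_ul <= 0:
        raise ValueError("titer must be positive")
    return target_colonies / titer_per_ul


# Hypothetical count: 120 colonies from 10 µl of a 1:100 dilution.
titer = colonies_per_microliter(120, 100, 10)
print(f"titer: {titer:.0f} colonies/µl")
print(f"volume for 50,000 colonies: {reaction_volume_for_plate(50_000, titer):.1f} µl")
```

A titer obtained this way indicates how much packaging reaction to plate in order to reach a chosen colony density per plate.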

(iv) Preparation of a cosmid library

The packaging reaction solution thus prepared was mixed with E. coli XL1-Blue MR and the mixture was plated
onto agarose plates containing ampicillin so as to produce 50,000 colonies per agarose plate with a diameter of
15 cm. After incubating the plates overnight at 37°C, an LB culture medium was added in the amount of 3 ml per plate to
suspend and collect colonies of E. coli. Each agarose plate was again washed with 3 ml of the LB culture medium and
the washing was combined with the original suspension of E. coli. The E. coli collected from all agarose plates was
placed in a centrifuge tube, glycerol was added to a concentration of 20%, and ampicillin was further added to a
final concentration of 50 µg/ml. A portion of the E. coli suspension was removed and the remainder was stored at
-80°C. The removed E. coli was diluted stepwise and plated onto agar plates to count the number of colonies per 1
ml of suspension.

Example 2 

(Screening of cosmid library and purification of colony)

A nitrocellulose filter (Millipore) with a diameter of 14.2 cm was placed on each LB agarose plate with a diameter
of 15 cm which contained 50 µg/ml of ampicillin. The cosmid library was plated onto the plates so as to produce 50,000
colonies of E. coli per plate, followed by incubation overnight at 37°C. E. coli on the nitrocellulose filter was transferred
to another nitrocellulose filter according to a conventional method to obtain two replica filters. According to the protocol
attached to the cosmid vector kit, cosmid DNA in the E. coli on the replica filters was denatured with an alkali,
neutralized, and immobilized on the nitrocellulose filter using a Stratalinker (Stratagene). The filters were heated for two hours
at 80°C in a vacuum oven. The nitrocellulose filters thus obtained were hybridized using two kinds of DNA produced,
respectively, from the 5'-end and 3'-end of human OCIF cDNA as probes. Namely, a plasmid was purified from E. coli
pKB/OIF10 (deposited at The Ministry of International Trade and Industry, the Agency of Industrial Science and
Technology, Biotechnology Laboratory, Deposition No. FERM BP-5267) containing OCIF cDNA. The plasmid containing
OCIF cDNA was digested with restriction enzymes KpnI and EcoRI. The fragments thus obtained were separated using
agarose gel electrophoresis. A KpnI/EcoRI fragment with a length of 0.2 kb was purified using a QIAEX II gel extraction
kit (Qiagen). This DNA was labeled with 32P using the Megaprime DNA Labeling System (Amersham) (5'-DNA probe).
Apart from this, a BamHI/EcoRV fragment with a length of 0.2 kb which was produced from the above plasmid by
digestion with restriction enzymes BamHI and EcoRV was purified and labeled with 32P (3'-DNA probe). One of the replica
filters described above was hybridized with the 5'-DNA probe and the other with the 3'-DNA probe. Hybridization and
washing of the filters were carried out according to the protocol attached to the cosmid vector kit. Autoradiography
detected several positive signals with each probe. One colony which gave positive signals with both probes was
identified. The colony on the agar plate corresponding to the signal on the autoradiogram was isolated and purified.
A cosmid was prepared from the purified colony by a conventional method. This cosmid was named pWEOCIF. The
size of the human genomic DNA contained in this cosmid was about 38 kb.

Example 3

( Determination of the nucleotide sequence of human OCIF genomic DNA ) 

(i) Subcloning of OCIF genomic DNA

Cosmid pWEOCIF was digested with restriction enzyme EcoRI. After separation of the DNA fragments thus
produced by electrophoresis using a 0.7% agarose gel, the DNA fragments were transferred to a nylon membrane
(Hybond-N, Amersham) by the Southern blot technique and immobilized on the nylon membrane using a Stratalinker
(Stratagene). On the other hand, plasmid pBKOCIF was digested with restriction enzyme EcoRI and a 1.6 kb fragment
containing human OCIF cDNA was isolated by agarose gel electrophoresis. The fragment was labeled with 32P using
the Megaprime DNA labeling system (Amersham).

Hybridization of the nylon membrane described above with the 32P-labeled 1.6-kb OCIF cDNA, performed
according to a conventional method, detected DNA fragments with sizes of 6 kb, 4 kb, 3.6 kb, and 2.6 kb. These
fragments, which hybridized with the human OCIF cDNA, were isolated using agarose gel electrophoresis and individually
subcloned into the EcoRI site of pBluescript II SK+ vector (Stratagene) by a conventional method. The resulting plasmids
were respectively named pBSE 6, pBSE 4, pBSE 3.6, and pBSE 2.6.

(ii) Determination of the nucleotide sequence

The nucleotide sequence of the human OCIF genomic DNA which was subcloned into the plasmids was determined
using the ABI Dideoxy Terminator Cycle Sequencing Ready Reaction kit (Perkin Elmer) and the 373 Sequencing
System (Applied Biosystems). The primers used for the determination of the nucleotide sequence were synthesized based
on the nucleotide sequence of human OCIF cDNA (Sequence ID No. 4 in the Sequence Table). The nucleotide
sequences thus determined are given as the Sequences No. 1 and No. 2 in the Sequence Table. The Sequence ID No.
1 includes the first exon of the OCIF gene and the Sequence ID No. 2 includes the second, third, fourth, and fifth exons.
A stretch of about 17 kb is present between the first and second exons.

Example 4

( Production of recombinant OCIF using COS-7 cells )

(i) Preparation of OCIF genomic DNA expression cosmid

To express OCIF genomic DNA in animal cells, an expression unit of expression plasmid pcDL-SRα296 (Molecular
and Cellular Biology, vol. 8, p 466-472, 1988) was inserted into cosmid vector pWE15 (Stratagene). First, the
expression plasmid pcDL-SRα296 was digested with restriction enzyme SalI to cut out an expression unit with a length of about
1.7 kb which includes an SRα promoter, SV40 late splice signal, poly (A) addition signal, and so on. The digestion
products were separated by agarose electrophoresis and the 1.7-kb fragment was purified using the QIAEX II gel extraction
kit (Qiagen). On the other hand, cosmid vector pWE15 was digested with restriction enzyme EcoRI and the fragments
were separated using agarose gel electrophoresis. pWE15 DNA 8.2 kb long was purified using the QIAEX II gel
extraction kit (Qiagen). The ends of these two DNA fragments were blunted using a DNA blunting kit (Takara Shuzo),
ligated using a DNA ligation kit (Takara Shuzo), and introduced into E. coli DH5α (Gibco BRL). The resultant
transformant was grown and the expression cosmid pWESRα containing an expression unit was purified using a Qiagen column
(Qiagen).

The cosmid pWEOCIF containing the OCIF genomic DNA with a length of about 38 kb obtained in Example 2 was
digested with restriction enzyme NotI to cut out the OCIF genomic DNA of about 38 kb. After separation by agarose
gel electrophoresis, the DNA was purified using the QIAEX II gel extraction kit (Qiagen). On the other hand, the
expression cosmid pWESRα was digested with restriction enzyme EcoRI, and the digestion product was extracted with
phenol and chloroform, ethanol-precipitated, and dissolved in TE.

pWESRα digested with restriction enzyme EcoRI and an EcoRI-XmnI-NotI adapter (#1105, #1156, New England
Biolabs) were ligated using T4 DNA ligase (Takara Shuzo Co., Ltd.). After removal of the free adapter by
agarose gel electrophoresis, the product was purified using the QIAEX gel extraction kit (Qiagen). The OCIF genomic DNA with
a length of about 37 kb which was derived from the digestion with restriction enzyme NotI and the pWESRα to which
the adapter was attached were ligated using T4 DNA ligase (Takara Shuzo). The DNA was packaged in vitro using the
Gigapack packaging extract (Stratagene) and used to infect E. coli XL1-Blue MR (Stratagene). The resultant
transformant was grown and the expression cosmid pWESRαOCIF, into which the OCIF genomic DNA was inserted, was
purified using a Qiagen column (Qiagen). The OCIF expression cosmid pWESRαOCIF was ethanol-precipitated,
dissolved in sterile distilled water, and used in the following analysis.

(ii) Transient expression of OCIF genomic DNA and measurement of OCIF activity 

A recombinant OCIF was expressed as described below using the OCIF expression cosmid pWESRαOCIF
obtained in (i) above and its activity was measured. COS-7 cells (8x10^ cells/well) (Riken Cell Bank, RCB0539) were
seeded in a 6-well plate using DMEM culture medium (Gibco BRL) containing 10% fetal bovine serum (Gibco BRL). On
the following day, the culture medium was removed and the cells were washed with serum-free DMEM culture medium. The
OCIF expression cosmid pWESRαOCIF which had been diluted with OPTI-MEM culture medium (Gibco BRL) was
mixed with Lipofectamine and the mixture was added to the cells in each well according to the attached protocol. The
expression cosmid pWESRα was added to the cells in the same manner as a control. The amounts of the cosmid DNA
and Lipofectamine were 3 µg and 12 µl, respectively. After 24 hours, the culture medium was removed and 1.5 ml of
fresh EX-CELL 301 culture medium (JRH Bioscience) was added to each well. The culture medium was recovered after
48 hours and used as a sample for the measurement of OCIF activity. The measurement of OCIF activity was carried
out according to the method described by Kumegawa M. et al. (Protein, Nucleic Acid, and Enzyme, vol. 34, p 999
(1989)) and the method of Takahashi N. et al. (Endocrinology vol. 122, p 1373 (1988)). The osteoclast formation in
the presence of activated vitamin D3 from bone marrow cells isolated from mice aged about 17 days was evaluated by
the induction of tartrate-resistant acid phosphatase activity. The inhibition of the acid phosphatase was measured
and used as the activity of the protein which possesses osteoclastogenesis-inhibitory activity (OCIF). Namely, 100
µl/well of an OCIF sample which was diluted with α-MEM culture medium (Gibco BRL) containing 2x10'^ M activated
vitamin D3 and 10% fetal bovine serum was added to each well of a 96-well microplate. Then, 3x10^ bone marrow cells
isolated from mice (about 17 days old) suspended in 100 µl of α-MEM culture medium containing 10% fetal bovine
serum were added to each well of the 96-well microplate and cultured for a week at 37°C and 100% humidity under a 5%
CO2 atmosphere. On days 3 and 5, 160 µl of the conditioned medium was removed from each well, and 160 µl of a
sample which was diluted with α-MEM culture medium containing 1x10'^ M activated vitamin D3 and 10% fetal bovine
serum was added. After 7 days from the start of culturing, the cells were washed with phosphate-buffered saline and
fixed with an ethanol/acetone (1:1) solution for one minute at room temperature. The osteoclast formation was detected
by staining the cells using an acid phosphatase activity measurement kit (Acid Phosphatase, Leucocyte, Cat. No.
387-A, Sigma). A decrease in the number of cells positive for acid phosphatase activity in the presence of tartaric
acid was taken as the OCIF activity. The results are shown in Table 1, which indicates that the conditioned medium
exhibits activity similar to that of natural type OCIF obtained from the IMR-90 culture medium and recombinant OCIF
produced by CHO cells.
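
The medium volumes in this assay imply a constant activated vitamin D3 concentration in the well, which the following sketch illustrates. Concentrations are expressed in relative units (1.0 = the concentration of the replacement medium used on days 3 and 5), so no absolute molarity is assumed. The sketch further assumes that the starting medium is exactly twice as concentrated as the replacement medium, that the cell suspension contains no added vitamin D3, and that none is consumed or degraded during culture; these are simplifying assumptions, and the function name is introduced here for illustration.

```python
def concentration_after_mixing(vol_a_ul, conc_a, vol_b_ul, conc_b=0.0):
    """Concentration after combining two solutions (additive volumes)."""
    total = vol_a_ul + vol_b_ul
    if total <= 0:
        raise ValueError("total volume must be positive")
    return (vol_a_ul * conc_a + vol_b_ul * conc_b) / total


# Day 0: 100 µl of sample medium at 2.0 (relative) plus 100 µl of cell suspension at 0.
well = concentration_after_mixing(100, 2.0, 100, 0.0)
print(f"day 0 well concentration: {well:.2f}")

# Days 3 and 5: remove 160 µl of the 200 µl, then add 160 µl of medium at 1.0.
for day in (3, 5):
    remaining_ul = 200 - 160
    well = concentration_after_mixing(remaining_ul, well, 160, 1.0)
    print(f"day {day} well concentration: {well:.2f}")
```

Under these assumptions the two-fold concentrated starting medium is halved by the cell suspension, and each medium change keeps the well at the same working concentration.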

TABLE 1

Activity of OCIF expressed by COS-7 cells in the conditioned medium

Dilution                       1/10   1/20   1/40   1/80   1/160   1/320
OCIF genomic DNA introduced    ++     ++     ++     ++     +
Vector introduced
Untreated

"++" indicates an activity inhibiting 80% or more of osteoclast formation, "+" indicates an activity inhibiting 30-80%
of osteoclast formation, and "-" indicates that no inhibition of osteoclast formation is observed.
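
The scoring scheme in the footnote to Table 1 can be expressed as a small function, sketched below. The footnote defines "++" as 80% or more inhibition and "+" as 30-80%; treating everything below 30% as "-" is an assumption made here for illustration, as is computing percent inhibition from counts of acid phosphatase-positive cells relative to an untreated control. The cell counts in the example are hypothetical.

```python
def percent_inhibition(treated_count, control_count):
    """Percent reduction in positive cells relative to the untreated control."""
    if control_count <= 0:
        raise ValueError("control count must be positive")
    return 100.0 * (control_count - treated_count) / control_count


def score_inhibition(pct):
    """Map percent inhibition to the symbols used in Table 1.

    Assumption: values below 30% are scored "-".
    """
    if pct >= 80.0:
        return "++"
    if pct >= 30.0:
        return "+"
    return "-"


# Hypothetical counts of positive cells per well; control well has 100.
control = 100
for dilution, treated in [("1/10", 5), ("1/160", 55), ("1/320", 90)]:
    pct = percent_inhibition(treated, control)
    print(f"{dilution}: {pct:.0f}% inhibition -> {score_inhibition(pct)}")
```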



(iii) Identification of the product by Western blotting

A buffer solution (10 µl) for SDS-PAGE (0.5 M Tris-HCl, 20% glycerol, 4% SDS, 20 µg/ml bromophenol blue, pH
6.8) was added to 10 µl of the sample for the measurement of OCIF activity prepared in (ii) above. After boiling for 3
minutes at 100°C, the mixture was subjected to 10% SDS polyacrylamide electrophoresis under non-reducing
conditions. The proteins were transferred from the gel to a PVDF membrane (ProBlott, Perkin Elmer) using a semi-dry blotting
apparatus (Biorad). The membrane was blocked and incubated for 2 hours at 37°C together with a horseradish
peroxidase-labeled anti-OCIF antibody obtained by labeling the previously obtained OCIF protein with horseradish
peroxidase according to a conventional method. After washing, the protein which had bound the anti-OCIF antibody was
detected using the ECL system (Amersham). As shown in Figure 1, two bands, one with a molecular weight of about
120 kilodaltons and the other about 60 kilodaltons, were detected in the supernatant obtained from the culture broth of
COS-7 cells transfected with pWESRαOCIF. On the other hand, these two bands with molecular weights of about
120 kilodaltons and 60 kilodaltons were not detected in the supernatant obtained from the culture broth of COS-7 cells
transfected with the pWESRα vector, confirming that the protein obtained was OCIF.

INDUSTRIAL APPLICABILITY 

The present invention provides a genomic DNA encoding a protein OCIF which possesses an
osteoclastogenesis-inhibitory activity and a process for preparing this protein by a genetic engineering technique using the genomic DNA.
The protein obtained by expressing the gene of the present invention exhibits an osteoclastogenesis-inhibitory activity
and is useful as an agent for the treatment and improvement of diseases involving a decrease in the amount of bone
such as osteoporosis, other diseases resulting from bone metabolism abnormality such as rheumatism or degenerative
joint disease, and multiple myeloma. The protein is further useful as an antigen to establish antibodies useful for an
immunological diagnosis of such diseases.

NOTE ON MICROORGANISM 

Depositing Organization:

The Ministry of International Trade and Industry, National Institute of Bioscience and
Human Technology, Agency of Industrial Science and Technology
Address: 1-3, Higashi-1-Chome, Tsukuba-shi, Ibaraki-ken, Japan

Date of Deposition: June 21, 1995 (originally deposited on June 21, 1995 and transferred to the international
deposition according to the Budapest Treaty on October 25, 1995)

Accession No. FERM BP-5267

TABLE OF SEQUENCES

Sequence number: 1 
Length of sequence: 1316 
Sequence Type: nucleic acid 
Strandedness : double 
Topology: linear 

Molecular type: genomic DNA (human OCIF genomic DNA-1) 



Sequence :

CTGGAGACAT ATAACnOAA CACTTGGCCC TGATGGGGAA GCAGCTCTGC AGGGACTTTT 60
TCACCCATCT GTAAACAATT TCAGTGCCAA CCCGCCAACT GTAATCCATG AATGGGACCA 120
CACTTTACAA GTCATCAAGT CTAACTTCTA GACCAGGGAA TTAATGCGGG AGACAGCGAA 180
CCCTACAGCA AAGTCCCAAA CnCTCTCGA TAGCTTGAGG CTAGTGCAAA GACCTCGAGG 240
AGCCTACTCC AGAACnCAG CGCGTAGGAA GCTCCGATAC CAATAGCCCT TTGATGATGG 300
TGGGGTTGGT GAACGGAACA GTGCTCCGCA AGGTTATCCC TGCCCCAGGC AGTCCAATTT 360
TCACTCTGCA GATTCTCICT CCCTCTAACT ACCCCAGATA ACAAGGAGTG AATGCAGAAT 420
AGCACCGGCT TTAGGGCCAA TCAGACAHA CTTAGAAAAA nCCTACTAC ATGCTITATG 480
TAAACrrCAA GATGAATGAT TCCGAACTCC CCGAAAACGG CTCACACAAT CCCATGCATA 540
AACAGGGCCC CTGTAATTTG AGCmCAGA ACCCGAAGTG AAGGGGTCAC GCAGCCGGCT 600
ACGGCGGAAA CTCACACOT TCGCCCAGCG AGAGGACAAA CGTCTGGGAC ACACTCCAAC 660
TGCGTCCGGA TCTTGGCTCC ATCGGACTCT CAGGGTGGAG GAGACACAAG CACACCAGCT 720
GCCCAGCGTG TCCCCAGCCC TCCCACCGCT GGTCCCCGCT GCCAGGAGGC TGGCCGCTGC 780
CGGGAAGGGG CCGGGAAACC TCAGAGCCCC GCGGAGACAG CAGCCGCCTT GTTCCTCACC 840
CCGGTGGCTT TTmrcccc TGCTCTCCCA GGGGACAGAC ACCACCGCCC CACCCCTCAC 900
GCCCCACCTC CCTGGGGGAT CCTTTCCCCC CCAGCCCTCA AAGCGHAAT CCTCGAGCTT 960
TCTGCACACC CCCCGACCGC TCCCGCCCAA CCTTCCTAAA AAAGAAAGGT CCAAAGTnC 1020
GTCCAGCATA GAAAAATCAC TGATCAAAGG CACGCGATAC TTCCTCTTGC CGGGACGCTA 1080
TATATAACCT GATGAGCGCA CGGGCTGCGC AGACGCACCG GAGCGCTCGC CCACCCGCCG 1140
CCTCCAACCC CCTCAGGTn CCGCCGACCA CA ATC AAC AAC TTC CTC TCC TCC 1193

Met Asn Lys Leu Leu Cys Cys
-20 -15 

GCG CTC GTG GTAACTCCCT GCGCCAGCCC ACGGCTGCCC GCCGCCTGCG 1242 
Ala Leu Val 

GAGGCTGCTG CCACCTGCTC TCCCAACCTC CCAGCCGACC GGCGGGGAGA AGGCTCCACT 1302 
CCCTCCCTCC CAGG 1316 

Sequence number: 2 
Length of sequence: 9898 
Sequence Type: nucleic acid 
Strandedness : double 
Topology: linear 

Molecular type: genomic DNA (human OCIF genomic DNA-2) 
Sequence : 

GCTTACTTTG TGCCAAATCT CATTAGGCTT AAGGTAATAC AGGACTTTGA GTCAAATGAT 60
ACTGTTGCAC ATAAGAACAA ACCTATTTTC ATGCTAAGAT GATGCCACTG TGTTCCTTTC 120
TCCTTCTAG TTT CTG GAC ATC TCC ATT AAG TGG ACC ACC CAG GAA ACG TTT 171
Phe Leu Asp Ile Ser Ile Lys Trp Thr Thr Gln Glu Thr Phe
-10 -5 1 

CCT CCA AAG TAC CTT CAT TAT GAC GAA GAA ACC TCT CAT CAG CTG TTG 219
Pro Pro Lys Tyr Leu His Tyr Asp Glu Glu Thr Ser His Gln Leu Leu
5 10 15 
TGT GAC AAA TGT CCT CCT GGT ACC TAC CTA AAA CAA CAC TGT ACA GCA 267
Cys Asp Lys Cys Pro Pro Gly Thr Tyr Leu Lys Gln His Cys Thr Ala
20 25 30 35

AAG TGG AAG ACC GTG TGC GCC CCT TGC CCT GAC CAC TAC TAC ACA GAC 315
Lys Trp Lys Thr Val Cys Ala Pro Cys Pro Asp His Tyr Tyr Thr Asp
40 45 50

AGC TGG CAC ACC AGT GAC GAG TGT CTA TAC TGC AGC CCC GTG TGC AAG 363
Ser Trp His Thr Ser Asp Glu Cys Leu Tyr Cys Ser Pro Val Cys Lys
55 60 65

GAG CTG CAG TAC GTC AAG CAG GAG TGC AAT CGC ACC CAC AAC CGC GTG 411
Glu Leu Gln Tyr Val Lys Gln Glu Cys Asn Arg Thr His Asn Arg Val
70 75 80

TGC GAA TGC AAG GAA GGG CGC TAC CTT GAG ATA GAG TTC TGC TTG AAA 459
Cys Glu Cys Lys Glu Gly Arg Tyr Leu Glu Ile Glu Phe Cys Leu Lys
85 90 95

CAT AGG AGC TGC CCT CCT GGA TTT GGA GTG GTG CAA GCT G GTACGTGTCA 509
His Arg Ser Cys Pro Pro Gly Phe Gly Val Val Gln Ala
100 105 110

ATGTGCAGCA AAATTMm GGATCATGCA AAGTCAGATA GTTCTGACAG TTTAG6ACAA 569 



CACTTTTGTT CTCATCACAT TATACCATAC CAAATTGCAA AGGTAATCAA ACCTCCCACC 629

TAGGTACTAT CTCTCTGCAG TGCTTCCAAA GCACCATTGC TCAGAGGAAT ACTTTGCCAC 689 

TACACCGCAA TTTAATGACA AATCTCAAAT GCAGCAAATT ATTCTCTCAT CAGATCCATC 749 

ATGGTmTT TTTTriTTTT TAAACAAACA AACTCAAGTT CCACTATTCA TAGHGATCT 809 

ATACCTCTAT AmCACTTC AGCATGCACA CCTTCAAACT GCAGCACm nCACAAACA 869 

TCAGAAATCT TAATTTATAC CAACACAGTA ATTATCCTCA TAnAATCAG ACTCTCGAGT 929

GCTAACAATA AGCACHATA ATTAATTATC TAAAAAATGA GAATGCTCAC CCGAATTGCA 989 

rrrcAmn aaaaacaagg CTAGncnc ctttaccatc gcacctcact gtttgccacc 1049 

GTAAGGACTA TAGCAGAATC TCTTCAATGA CCTTATTCTT TATCTTAGAC AAAACACATT 1109 
CTCAACCCAA CACCAACCAC HCCCTATAA ACCAACTGCT nCTCTTTTG CATTTTGAAC 1169 
AGCATTGGTC ACCGCTCATG TCTATTCAAT CTTTTAAACC ACTAACCCAC G ITTriTlTC 1229 
TGCCACATn GCCAACCTTC ACTCCACCCT ATAACTTTTC ATAGCTTCAC AAAAHAAGA 1289 
GTATCCACn ACHAGATCG AACAACTAAT CAGTATAGAT TCTGATGACT CAGTTTCAAC 1349 
CAGTGTTTCT CAACTGAAGC CCTGCTCATA TTTTAACAAA TATCTCGATT CCTAGGCTCG 1409 
ACtCCTTTTT GTGGGCACCT CTCCTGCGCA nGTABAATT TTCCCAGCAC CCCTGCACTC 1469 
TAGCCACTAG ATACCAATAG CAGTCCnCC CCCATGTGAC AGCCAAAAAT CTCHCAGAC 1529 
ACTGTCAAAT CTCGCCAGGT GGCAAAATCA CTCCTGGTTG AGAACAGGGT CATCAATGCT 1589 
AACTATCTCT AACTATTTTA ACTCTCAAAA CTTGTGATAT ACAAACTCTA AAnATTAGA 1649 
CCACCAATAC nTAGGTTTA AAGGCATACA AATGAAACAT TCAAAAATCA AAATCTAnC 1709 
TCTTTCTCAA ATAGTGAATC HATAAAAH AATCACAGAA GATGCAAAH GCATCAGACT 1769 
CCCTTAAAAT TCCTCnCCT ATGAGTATn GAGGGAGGAA TTGGTGATAG nCCTACTTT 1829 
CTAnCCATC GTACTITGAG ACTCAAAAGC TAAGCTAAGT TGTGTGTGTG TCAGGGTCCG 1889 
GGCTGTCGAA TCCCATCACA TAAAAGCAAA TCCATGTAAT TCAnCAGTA ACTT6TATAT 1949 
GTAGAAAAAT GAAAAGTGGC CTATCCAGCT TGGAAACTAG AGAATTTTGA AAAATAATGC 2009 
AAATCACAAC CATCTTrCTT AAATAAGTAA GAAAATCTGT HGTAGAATG AAGCAAGCAG 2069 
CCA6CCAGAA GACTCAGAAC AAAAGTACAC ATmACTCT CTGTACACTG GCAGCACAGT 2129 
CGCAnTAn TACCTCTCCC TCCCTAAAAA CCCACACACC GCnCCTCTT CGCAAATAAG 2189
AGGTHCCAG CCCAAAGAGA AGCAAAGACT ATGTGGTGTT ACTCTAAAAA GTATTTAATA 2249 
ACCGTTTTGT TCTTCCTGTr GCTGTnTGA AATCAGATTG TCTCCTCTCC ATATTTTATT 2309 
TACTTCAnC TGTTAAnCC TGTGGAATTA CHAGAGCAA GCATGGTCAA nCTCAACTC 2369 
TAAACCCAAA TTTCTCCATC ATTATAATTT CACATTTTGC CTGGCAGGTT ATAATTTTTA 2429 
TAmCCACT CATACTAATA AGGTAAAATC AITACmGA TGGATACATC nTTTCATAA 2489 
AAAGTACCAT CAGTTATAGA GGGAAGTCAT GTTCATGTTC AGCAAGGTCA TTAGATAAAG 2549 
CTTCTCAATA TATTATGAAA CAHACnCT GTCATTCTTA CATTCTTm GTTAAATAAC 2609 
TTTAAAAGCT AACnACCTA AAAGAAATAT CTCACACATA TCAACTTCTC ATTACGATGC 2669 
AGGA6AACAC CCAACCCACA GATATGTATC TGAACAATCA ACAAGATTCT TAGGCCCGGC 2729 
ACGGTGGCTC ACATCTCTAA TCTCAAGACT TTGAGAGCTC AACGCGGGCA GATCACCTGA 2789 
CGTCAGGAGT TCAAGACCAG CCTGGCCAAC ATGATGAAAC CCTGCCTCTA CTAAAAATAC 2849 
AAAAATTAGC ACGGCATGCT GGTGCATGCC TGCAACCCTA GCTACTCACG AGGCTGACAC 2909 
ACGAGAATCT CTTGAACCCT CGACCCGGAG GTTGTCGTGA GCTGAGATCC CTCTACTCCA 2969 
CTCCAGCCTG GCTGACAGAG ATGACAPTCC GTCCCTGCCC CCGCCCCCCC CTTCCCCCCC 3029 
AAAAAGAnC TTCTTCATGC AGAACATACG GCAGTCAACA AAGGGAGACC TGCCTCCAGG 3089 
TGTCCAAGTC ACTTATTTCG AGTAAAHAC CAATGAAAGA ATGCCATGGA ATCCCTCCCC 3149 
AAATACCTCT CCHATGATA TTCTACAATT TGATATAGAC TTGTATCCCA TTTAAGGAGT 3209 
AGGATGTAGT AGGAAAGTAC TAAAAACAAA CACACAAACA CAAAACCCTC HTGCrnGT 3269 
AAGCTGGnC CTAACATAAT GTCAGTGCAA TGCTGGAAAT AATATTTAAT ATGTCAACGT 3329 
mAGGCTGT GTTTTCCCCT CCTCTTCTTT TTTTCTCCCA GCCCTTTGTC ATTTTTGCAC 3389 
CTCAATGAAT CATCTAGAAA GAGACAGGAG ATGAAACTAG AACCAGTCCA TTTTCCCCCT 3449 
TTTTTTAm TCTGGTTnG GTAAAAGATA CAATGAGCTA GGAGGTTGAG ArTTATAAAT 3509 
GAAGTHAAT AAGnTCTGT AGCmCAn TTTCTCnTC ATATrTGTTA TCTTGCATAA 3569 
GCCAGAATTG GCCTGTAAAA TCTACATATG GATAnCAAG TCTAAATCTG TTCAACTAGC 3629 
mCACTAGA TGGAGATAH TTCATATTCA GATACACTGG AATGTATGAT CTAGCCATGC 3689 
GTAATATAGT CAACTCTnC AAGCTATTTA mTTAATAG CGTCTTTAGT TGTGCACTCG 3749
TTCAACTTIT TCTGCCAATG ATTTCTTCAA ATTTATCAAA TATTTTTCCA TCATGAAGTA 3809 
AAATCCCCTT GCAGTCACCC TTCCTGAAGT nGAACGACT CTGCTGTTTT AAACACTTTA 3869
ACCAAATGGT ATATCATCTT CCCmACTA TCTACCTTAA CTGCACGCn ACGCTTTTCA 3929 
CTCAGCCCCC AACTTTATTC CCACCTTCAA AACTTTAm TAATCnCTA AATTTTTACT 3989 
TCTCAACCn ACCATACnA CCAGTTCCTT CACAATTAGC ATTCAGGAAA CAAAGAACTT 4049 
CAGTAGCAAC TCATTGGAAT TTAATGATCC ACCATTCAAT CCCTACTAAT TTCAAAGAAT 4109 
GATAnACAG CACACACACA GCAGTTATCT TGATmCTA GGAATAATTG TATGAAGAAT 4169 
ATGGCTGACA ACACCCCCTT ACTGCCACTC AGCGGAGGCT GGACTAATGA ACACCCTACC 4229 
CTTCTTTCCT TTCCTCTCAC AITTCATGAG CGTTTTGTAC GTAACGAGAA AAnCACTTC 4289 
CATHGCAn ACAAGCAGCA CAAACTCGCA AACGGGATGA TGGTCCAACT TTTGnCTGT 4349 
CTAATGAAGT GAAAAATGAA AATCCTAGAG TTnGTGCAA CATAATACTA GCACTAAAAA 4409 
CCAACTGAAA AGTCTTTCCA AAACTGTGTT AACAGGGCAT CTGCTGGGAA ACGATTTCAC 4469 
GAGAAGGTAC TAAATTGCTT GGTATTTTCC GTAG GA ACC CCA GAG CGA AAT ACA 4523

Gly Thr Pro Glu Arg Asn Thr 
115 

GTT TGC AAA AGA TGT CCA GAT GGG TTC TTC TCA AAT GAG ACG TCA TCT 4571
Val Cys Lys Arg Cys Pro Asp Gly Phe Phe Ser Asn Glu Thr Ser Ser
120 125 130 135 

AAA GCA CCC TGT AGA AAA CAC ACA AAT TGC AGT GTC TTT GGT CTC CTG 4619
Lys Ala Pro Cys Arg Lys His Thr Asn Cys Ser Val Phe Gly Leu Leu
140 145 150 

CTA ACT CAG AAA GGA AAT GCA ACA CAC GAC AAC ATA TGT TCC GGA AAC 4667
Leu Thr Gln Lys Gly Asn Ala Thr His Asp Asn Ile Cys Ser Gly Asn
155 160 165

ACT GAA TCA ACT CAA AAA TGT GGA ATA G GTAATTACAT TCCAAAATAC 4715
Ser Glu Ser Thr Gln Lys Cys Gly Ile
170 175 

GTCTnOTAC GATTTTGTAG TATCATCTCT CTCTCTGAGT TGAACACAAG GCCTCCACCC 4775 
ACATTCnCC TCAAACTTAC ATTTrCCCTT TCTTGAATCT TAACCACCTA ACCCTACTCT 4835 
CGATGCATTA CTGCTAAACC TACCACTCAC AATCTCTCAA AAACTCATCT TCTCACACAT 4895 
AACACCTCAA AGCTTCATTT TCTCTCCTTT CACACTCAAA TCAAATCTTG CCCATAGCCA 4955 
AACGGCAGTG TCAAGTHGC CACTGACATG AAAHAGGAG AGTCCAAACT CTAGAAnCA 5015 
CCnGTCTGT TAnACTTTC ACGAATGTCT GTAnAHAA GTAAAGTATA TATTGGCAAC 5075 
TAAGAAGCAA AGIGATATAA ACATGATGAC AAAHAGCCC AGCCATGGTG GCmCTCCT 5135 
ATAATCCCAA CATTTTCCGC CCCCAACCTA CCCAGATCAC TTCAGGTCAC GATTTCAACA 5195 
CCACCCTCAC CAACATGGTG AAACCTTGTC TCTACTAAAA ATACAAAAAT TACCTGCGCA 5255 
TGGTAGCAGC CACTTCTAGT ACCACCTACT CAGCGCTGAG GCAGGAGAAT CGCTTGAACC 5315 
CACCACATGC ACGTTCCACT GACCTCAGAT TCTACCACTG CACTCCAGTC TGGCCAACAC 5375 
AGCAAGATTT CATCACACAC ACACACACAC ACACACACAC ACACAnAGA AATGTGTACT 5435 
TGGCrrrCTT ACCTATCGTA nAGTGCATC TATTGCATGC AACHCCAAG CTACTCTGGT 5495 
TGTCTTAAGC TCnCATTGC CTACAGGTCA CTACTATTAA CnCACCm TOCCATGCA 5555 
TTCCACGGTA GTCATGACAA TTCATCAGGC TACTCTCTCT CTTCACCnC TCACTCCCAC 5615 
CACTAGACTA ATCTCAGACC TTCACTCAAA GACACATTAC ACTAAAGATG ATTTGCTTTT 5675 
TTCTGnTAA TCAAGCAATC GTATAAACCA GCTTCACTCT CCCCAAACAG TTTTrCGTAC 5735 
TACAAACAAC TTTATGAAGC AGACAAATCT GAAHGATAT ATATATGAGA nCTAACCCA 5795 
CTTCCAGCAT TGTTTCATTC TCTAATTGAA ATCATAGACA AGCCATTTTA CCCTTTGCTT 5855 
TCTTATCTAA AAAAAAAAAA AAAAAAATCA ACCAAGGGCT ATTAAAAGCA GTCATCAAAT 5915
TTTAACATTC TCTTTAATTA AITCATTTTT AATTTTACTT TTTnCAm AnCTGCACT 5975 
TACTATGTCG TACTCTGCTA TACACGCTTT AACATTTATA AAAACACTCT GAAACTTGCT 6035 
TCAGATGAAT ATACCTAGTA CAACCGCACA ACTAGTAnC AAAGCCAGGT CTGATCAATC 6095
CAAAAACAAA CACCCATTAC TCCCATmC TCGCACATAC TTACTCTACC CACATGCTCT 6155 
CCCCTTTCTA ATCCCTATGT AAATAACATA CTTTTATGTT TGCnATTTT CCTATGTAAT 6215 
CTCTACTTAT ATATCTGTAT CTATCTCTTC CTTTGTrrCC AAAGGTAAAC TATGTCTCTA 6275 
AATCTCCGCA AAAAATAACA CACTATOCA AATTACTGTT CAAAHCCTT TAAGTCAGTC 6335 
ATAAnATTT GTTTTGACAT TAATCATCAA CTTCCCTGTG GCTACTAGGT AAACCTTTAA 6395 
TACAATCTTA ATGTTTGTAT TCAmTAAG AATnTTCGC TCnACnAT nACAACAAT 6455 
ATHCACTCT AAHAGACAT TTACTAAACT TTCTCTTCAA AACAATCCCC AAAAAAGAAC 6515 
ATTAGAAGAC ACGTAAGCTC AGTTCGTCTC TGCCACTAAG ACCAGCCAAC AGAACCTTGA 6575 
TTTTAnCAA ACTTTCCATT mCCATAn TTATCTTGGA AAAHCAAn GTGTTGCnT 6635 
TTTGmnC niGTAnCA ATACACTCTC AGAAATCCAA TTCTTCACTA AATCTTCTGG 6695 
GTTTTCTAAC CTTTCTTTAG AT GTT ACC CTG TGT GAG GAG GCA TTC TTC AGG 6747

Asp Val Thr Leu Cys Glu Glu Ala Phe Phe Arg 
180 185 

TTT GCT GTT CCT ACA AAG TTT ACG CCT AAC TGG CTT AGT GTC TTG GTA 6795
Phe Ala Val Pro Thr Lys Phe Thr Pro Asn Trp Leu Ser Val Leu Val
190 195 200 

GAC AAT TTG CCT GGC ACC AAA GTA AAC GCA GAG AGT GTA GAG AGG ATA 6843 
Asp Asn Leu Pro Gly Thr Lys Val Asn Ala Glu Ser Val Glu Arg Ile
205 210 215 
AAA CGG CAA CAC AGC TCA CAA GAA CAG ACT TTC CAG CTG CTG AAG TTA 6891
Lys Arg Gln His Ser Ser Gln Glu Gln Thr Phe Gln Leu Leu Lys Leu
220 225 230 235 

TGG AAA CAT CAA AAC AAA GAC CAA GAT ATA GTC AAG AAG ATC ATC CAA G 6940
Trp Lys His Gln Asn Lys Asp Gln Asp Ile Val Lys Lys Ile Ile Gln
240 245 250 

GTATGATAAT CTAAAATAAA AACATCAATC AGAAATCAAA CACACCTAn TATCATAAAC 7000 
CACCAACAAG ACTCCATCTA TGTTTACnG TCTCCATCTT CTTTCCCTCT TGCAATCATT 7060 
OnCGACTGA AAAAGTTTCC ACCTGATAAT CTAGATGTGA nCCACAAAC ACTTATACAA 7120 
GGTTTTGnC TCACCCCTGC TCCCCACTTT CCTTGTAAAG TATGHGAAC ACTCTAAGAC 7180 
AACACAAATC CATTTGAAGC CACCCCTGTA TCTCACCCAG TCGCTTCCAC ATCCCTTAAC 7240 
CCTTCTGTAA GCACCCCCTC TAGACCACCA AGCAGAACCT CTATAACCAC HTCTATCH 7300 
ACATTGCACC TCTACCAACA ACCTCTCnC TArTTACTTC GTAATTCTCT CCACGTAGGC 7360 
TTTTCGTAGC HACAAATAT GnCHATTA ATCCTCATGA TATGGCCTGC ATTAAAATTA 7420 
TTTTAATCGC ATATGHATG ACAATTAATC AGATAAAATC TGAAAAGTGT TTCAGCCTCT 7480 
TCTAGGAAAA ACCTACTTAC AGCAAAATGT TCTCACATCT TATAAGTHA TATAAAGAn 7540 
CTCCTHACA AATGGTGTGA CAGAGAAACA CAGAGACATA GGGAGACAAC TCTGAAAGAA 7600 
TCTGAAGAAA AGGAGTHCA TCCAGTCTGG ACTGTAAGCT TTACCACACA TCATGGAAAC 7660 
AGTTCTGACT TCAGTAACCA TrCCGAGCAC ATGCTAGAAG AAAAAGGAAG AAGAGTnCC 7720 
ATAATGCAGA CAGGGTCACT CAGAAAHCA HCAGGTCCT CACCAGTAGT TAAATGACTG 7780 
TATAGTCTTG CACTACCCTA AAAAACHCA AGTATCTGAA ACCGGGGCAA CAGATTTTAG 7840 
GACACCAACC TCTTTCAGAG CTGATTGCTT TTCCmTCC AAACAGTAAA CTTTTATGn 7900 
nGAGCAAAC CAAAACTATT CTTTGAACCT ATAATTAGCC CTGAACCCGA AAGAAAACAG 7960 
AAAATCAGAG ACCGHACAA HGGAAGCAA CCAAATTCCC TATHTATAA ATGAGGACAT 8020 
TTTAACCCAG AAAGATCAAC CGATTTGGCT TACGCCTCAC ACATACTAAG TGACTCATCT 8080
CAHAATAGA AATGTTAGTT CCTCCCTCTT AGCITTGTAC CCTACCTTAT TACTGAAATA 8140 
nCTCTACCC TCTCTCTCTC CTTTAGTTCC TCGACCTCAT GTCnTGACT mCAGATAT 8200 
CCTCCTCATG GAGGTAGTCC TCT6GTGCTA TCTGTATTCT TTAAAGGCTA CTTACGGCAA 8260 
HAACTTATC AACTAGCGCC TACTAATCAA ACTTTGTAn ACAAAGTAGC TAACTTGAAT 8320 
ACTTTCCTTT TTTTCTCAAA tCTTATCCTC GTAATTTCTC AAACTTTTTC TTACAAAACT 8380 
CAGACTCATG TGTCTTATTT TCTACTGHA ATITTCAAAA TTACCAGCTT CTTCCAAACT 8440 
mCTTGCAT CCCAAAAATA TATACCATAT TATCTTAm TAACAAAAAA TATTTATCTC 8500 
ACnCTTACA AATAAATCCT GTCACHAAC TCCCTCTCAA AAGAAAAGGT TATCAnCAA 8560 
ATATAAHAT CAAAHCTCC AAGAACCnT TGCCTCACGC TTGTTTTATG ATGCCATTCC 8620 
ATGAATATAA ATGATGTGAA CACTTATCTG GGCTTTTGCT TTATGCAG AT ATT GAC 8676

Asp Ile Asp

CTC TGT GAA AAC AGC GTG CAG CGG CAC ATT GGA CAT GCT AAC CTC ACC 8724
Leu Cys Glu Asn Ser Val Gln Arg His Ile Gly His Ala Asn Leu Thr
255 260 265 270 

TTC GAG CAG CTT CGT AGC TTG ATG GAA AGC TTA CCG GGA AAG AAA GTG 8772
Phe Glu Gln Leu Arg Ser Leu Met Glu Ser Leu Pro Gly Lys Lys Val
275 280 285 

GGA GCA GAA GAC ATT GAA AAA ACA ATA AAG GCA TGC AAA CCC AGT GAC 8820
Gly Ala Glu Asp Ile Glu Lys Thr Ile Lys Ala Cys Lys Pro Ser Asp
290 295 300 

CAG ATC CTG AAG CTG CTC AGT TTG TGG CGA ATA AAA AAT GGC GAC CAA 8868
Gln Ile Leu Lys Leu Leu Ser Leu Trp Arg Ile Lys Asn Gly Asp Gln
305 310 315 

GAC ACC TTG AAG GGC CTA ATG CAC GCA CTA AAG CAC TCA AAG ACG TAC 8916
Asp Thr Leu Lys Gly Leu Met His Ala Leu Lys His Ser Lys Thr Tyr
320 325 330 

CAC TTT CCC AAA ACT GTC ACT CAG AGT CTA AAG AAG ACC ATC AGG TTC 8964
His Phe Pro Lys Thr Val Thr Gln Ser Leu Lys Lys Thr Ile Arg Phe
335 340 345 350 

CTT CAC AGC TTC ACA ATG TAC AAA TTG TAT CAG AAG TTA TTT TTA GAA 9012
Leu His Ser Phe Thr Met Tyr Lys Leu Tyr Gln Lys Leu Phe Leu Glu
355 360 365 

ATG ATA GGT AAC CAG GTC CAA TCA GTA AAA ATA AGC TGC TTA 9054
Met Ile Gly Asn Gln Val Gln Ser Val Lys Ile Ser Cys Leu
370 375 380 

TAACTGCAAA TCCCCATTCA CCTGTTTCCT CACAATTGCC CACATCCCAT CGATCACTAA 9114 
ACTGTHCTC AGGCACnGA GGCTTTCAGT CATATCTTTC TCAnACCAC TCACTAATH 9174 
TCCCACAGGG TACTAAAACA AACTATCATG TGGAGAAACC ACTAACATCT CCTCCAATAA 9234
ACCCCAAATG GTTAATCCAA CTGTCACATC TCCATCCTTA TCTACTGACT ATAmTCCC 9294 
TTATTACTGC TTGCAGTAAT TCAACTGCAA AHAAAAAAA AAAAACTACA CTCCACTCGG 9354 
CCmCTAAA TATGCGAATC TCTAACTTAA ATAGCTHGC CATTCCACCT ATGCTAGAGG 9414 
CTTTTATTAC AAAGCCATAT TTTTTTCTGT AAAACTTACT AATATATCTG TAACACTAH 9474 
ACAGTATTGC TATTTATAn CATTCAGATA TAAGATTTGC ACATAHATC ATCCTATAAA 9534
CAAACGGTAT GACnAATTT TAGAAAGAAA AmTATTCT CTTrATTATG ACAAATCAAA 9594 
CAGAAAATAT ATATTTTTAA TGCAAACTTT GTACCATTTT TCTAATACCT ACTGCCATAT 9654 
TTTTCTCTCT CCAGTATTTT TATAATTITA TCTCTATAAC CTGTAATATC ATTTTATAGA 9714 
AAATGCATTA nTACTCAAT TGTTTAATGT TGCAAAACAT ATCAAATATA AATTATCTCA 9774
ATAnACATC CTCTCAGAAA TTCAATCTAC CTIATTTAAA AGATTTTATP CTTTTATAAC 9834 
TATATAAATG ACATTATTAA ACTTnCAAA nATTTmA TTCCTTTCTC TCTTCCTTTT 9894 
ATTT 9898 

Sequence number: 3
Length of sequence: 401 
Sequence Type: amino acid 
Strandedness : single stranded 
Topology: linear 
Molecular type: protein 



Sequence: 

Met Asn Asn Leu Leu Cys Cys Ala Leu Val Phe Leu Asp Ile Ser
-20 -15 -10

Ile Lys Trp Thr Thr Gln Glu Thr Phe Pro Pro Lys Tyr Leu His
-5 1 5

Tyr Asp Glu Glu Thr Ser His Gln Leu Leu Cys Asp Lys Cys Pro
10 15 20

Pro Gly Thr Tyr Leu Lys Gln His Cys Thr Ala Lys Trp Lys Thr
25 30 35

Val Cys Ala Pro Cys Pro Asp His Tyr Tyr Thr Asp Ser Trp His
40 45 50
Ile Gln Asp Ile Asp Leu Cys Glu Asn Ser Val Gln Arg His Ile
250 255 260

Gly His Ala Asn Leu Thr Phe Glu Gln Leu Arg Ser Leu Met Glu
265 270 275

Ser Leu Pro Gly Lys Lys Val Gly Ala Glu Asp Ile Glu Lys Thr
280 285 290

Ile Lys Ala Cys Lys Pro Ser Asp Gln Ile Leu Lys Leu Leu Ser
295 300 305

Leu Trp Arg Ile Lys Asn Gly Asp Gln Asp Thr Leu Lys Gly Leu
310 315 320

Met His Ala Leu Lys His Ser Lys Thr Tyr His Phe Pro Lys Thr
325 330 335

Val Thr Gln Ser Leu Lys Lys Thr Ile Arg Phe Leu His Ser Phe
340 345 350

Thr Met Tyr Lys Leu Tyr Gln Lys Leu Phe Leu Glu Met Ile Gly
355 360 365

Asn Gln Val Gln Ser Val Lys Ile Ser Cys Leu
370 375 380

Sequence number: 4 
Length of sequence: 1206 
Sequence Type : nucleic acid 
Strandedness : single stranded 
Topology: linear 
Molecular type: cDNA 
Sequence:
ATGAACAACT TCCTCTCCTC CCCCCTCCTC TTTCTGGACA TCTCCATTAA CTCCACCACC 60
CAGGAAACGT TTCCTCCAAA CTACCTTCAT TATCACCAAC AAACCTCTCA TCAGCTGTTC 120
TCTCACAAAT GTCCTCCTGG TACCTACCTA AAACAACACT CTACAGCAAA GTGGAAGACC 180
GTGTCCGCCC CTTCCCCTGA CCACTACTAC ACAGACAGCT GGCACACCAG TGACCACTGT 240
CTATACTGCA CCCCCGTCTG CAAGGAGCTG CACTACCTCA ACCAGGACTG CAATCGCACC 300
CACAACCGCG TGTGCGAATG CAAGCAACGG CGCTACCTTG AGATAGAGTT CTCCTTCAAA 360
CATACCAGCT GCCCTCCTGG ATTTCGAGTG GTGCAAGCTG GAACCCCAGA GCGAAATACA 420
GTTTCCAAAA GATCTCCAGA TGCCTTCTTC TCAAATGAGA CCTCATCTAA AGCACCCTCT 480
AGAAAACACA CAAATTCCAG TCTCTTTGGT CTCCTCCTAA CTCAGAAAGG AAATGCAACA 540
CACCACAACA TATGTTCCGG AAACACTGAA TCAACTCAAA AATCTCCAAT ACATGTTACC 600
CTGTGTCAGC AGCCATTCTT CAGCTTTGCT GTTCCTACAA ACTTTACCCC TAACTCCCTT 660
AGTCTCTTGG TACACAATTT GCCTGCCACC AAAGTAAACC CACAGAGTGT AGAGAGGATA 720
AAACGGCAAC ACAGCTCACA AGAACAGACT TTCCAGCTGC TGAAGTTATG GAAACATCAA 780
AACAAAGACC AACATATACT CAACAAGATC ATCCAAGATA TTGACCTCTG TGAAAACAGC 840
CTGCAGCCGC ACATTCGACA TGCTAACCTC ACCTTCGAGC AGCTTCGTAG CTTGATCGAA 900
AGCTTACCGG GAAAGAAAGT GGGACCAGAA GACATTGAAA AAACAATAAA GGCATGCAAA 960
CCCAGTCACC AGATCCTGAA GCTGCTCAGT TTCTGGCGAA TAAAAAATGG CGACCAAGAC 1020
ACCTTCAACG CCCTAATGCA CCCACTAAAG CACTCAAAGA CGTACCACTT TCCCAAAACT 1080
GTCACTCACA GTCTAAAGAA GACCATCAGC TTCCTTCACA GCTTCACAAT GTACAAATTC 1140
TATCAGAAGT TATTTTTACA AATCATAGGT AACCAGGTCC AATCAGTAAA AATAAGCTGC 1200
TTATAA 1206
SEQUENCE LISTING
(1) GENERAL INFORMATION:

(i) APPLICANT: 

(A) NAME: SNOW BRAND MILK PRODUCTS CO., LTD.

(B) STREET: 1-1, NAEBOCHO 6-CHOME 

(C) CITY: HIGASHI-KU, SAPPORO-SHI

(D) STATE: HOKKAIDO

(E) COUNTRY: JP

(F) POSTAL CODE (ZIP): NONE

lllm Zz^'""- '"^^^"NG PROTEIN 

(iii) NUMBER OF SEQUENCES: 4

(iv) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Floppy disk

(B) COMPUTER: IBM PC compatible

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.25 (EPO)

(v) CURRENT APPLICATION DATA:

APPLICATION NUMBER: EP 97935810.8
(vi) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: JP 235928/96 

(B) FILING DATE: 19-AUG-1996

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1316 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA (human OCIF genomic DNA-1)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

S^^^Sit! CACTTGGCCC TGATGGGGAA GCAGCTCTGC AQGGACTTTT 60 

TCAGCCATCT GTAAACAATT TCAGTGGCAA CCCGCGAACT GTAATCCATG AATQOGACCA 120

™r^^ GTCATCAAGT CTAACTTCTA GACCAGGGAA TTAAT^^ ^l^^ l"o 
^T^^ AAGTGCCAAA CTTCTGTCGA TAGCTTOAGG CTAGTG6AAA GACCTCGAGG 240 
^Jf^"^ A6AAGTTCAG CGCGTAGGAA GCTCCQATAC CAATAGCCCT TTGATGATGO 300 
TGGOGTTGOT GAAGGGAACA GTGCTCCGCA AGGTTATCCC TGCCCCAGGC AGTCCW^TTT 360 
"^Tr^JSt '"^"^^'^^ GGCTCTAACT ACCCCAGATA ACA^^G^^ d^^l 420 
AGCACGGGCT TTAGGGCCAA TCAGACATTA GTTAQAAAAA TTCCTACTAC ATGGTTTOTG 480
TAAACTTGAA GAT6AATGAT TGCGAACTCC CCGAAAAGGG CTCAGaSaT GCC^GCATA 540

CTGTAATTTG AGGTTTCAGA ACCCGAAGTG AAGG^^^J^G S^GT 600 
ACGGCGQAAA CTCACAGCTT TCGCCCAGCG AGAGGACAAA GGTCTGGGAC ACACTCCAAC 660 
TGCGTCCGGA TCTTGOCTGG ATCGGACTCT CAGGGTGGAG GAGACACAAG CACAGCA^ 720 
S^^l "^^"^ TCCCACCGCT GGTCCCGGCT GCCAGG^ TO^^ 
nr^^I;^° CCGGGAAACC TCAQAGCCCC GCGQAGACAQ CAGCCGCCTT GTTCCTCAGC 840 
CCGOTOGCTT TTTTTTCCCC TGCTCTCCCA GGGGACAGAC ACCACCGCCC CACCCCTCAC 900

GCCCCACCTC CCTGGGOGAT CCTTTCCGCC CCAGCCCTGA AAGCOTOAAT C^^^ 96^ 
TCTGCACACC CCCCGACCGC TCCCGCCCAA GCTTCCTAAA AAAGAAaSSt llH 
GTCCAGGATA GAAAAATGAC TGATCAAAGG CAGGCGATAC TTCCTGTTGC COG^CGCTA 1080 
r^rl^^'' °*^«=«CA CGGGCTGCGG AGACGCACCG GAGCGCTCO^ C^C^G U« 
CCTCCAAGCC CCTGAGGTTT CCGGGGACCA CA ATG AAC AAG TTG CTG TGC TGC 1193
Met Asn Lys Leu Leu Cys Cys
-20 -15

GCG CTC GTG GTAAGTCCCT GGGCCAGCCQ ACGGGTGCCC GGCGCCTGGG 1242 
Ala Leu Val

GAGGCTGCTG CCACCTGGTC TCCCAACCTC CCAGCGGACC GGCGGGGAGA AGGCTCCACT 1302 
CGCTCCCTCC CAGG 1316 



(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9898 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA (human OCIF genomic DNA-2)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

GCTTACTTTG TGCCAAATCT CATTAGGCTT AAGGTAATAC AGGACTTTGA GTCAAATGAT 60 
ACTGTTGCAC ATAAGAACAA ACCTATTTTC ATGCTAAGAT GATGCCACTG TGTTCCTTTC 120 
TCCTTCTAG TTT CTG GAC ATC TCC ATT AAG TGG ACC ACC CAG GAA ACG TTT 171 
Phe Leu Asp Ile Ser Ile Lys Trp Thr Thr Gln Glu Thr Phe
-10 -5 1 

CCT CCA AAG TAC CTT CAT TAT GAC GAA GAA ACC TCT CAT CAG CTG TTG 219
Pro Pro Lys Tyr Leu His Tyr Asp Glu Glu Thr Ser His Gln Leu Leu
5 10 15

TGT GAC AAA TGT CCT CCT GGT ACC TAC CTA AAA CAA CAC TGT ACA GCA 267 
Cys Asp Lys Cys Pro Pro Gly Thr Tyr Leu Lys Gln His Cys Thr Ala
20 25 30 35 

AAG TGG AAG ACC GTG TGC GCC CCT TGC CCT GAC CAC TAC TAC ACA GAC 315 
Lys Trp Lys Thr Val Cys Ala Pro Cys Pro Asp His Tyr Tyr Thr Asp 
40 45 50 

AGC TGG CAC ACC AGT GAC GAG TGT CTA TAC TGC AGC CCC GTG TGC AAG 363
Ser Trp His Thr Ser Asp Glu Cys Leu Tyr Cys Ser Pro Val Cys Lys
55 60 65 

GAG CTG CAG TAC GTC AAG CAG GAG TGC AAT CGC ACC CAC AAC CGC GTG 411 
Glu Leu Gln Tyr Val Lys Gln Glu Cys Asn Arg Thr His Asn Arg Val
70 75 80 

TGC GAA TGC AAG GAA GGG CGC TAC CTT GAG ATA GAG TTC TGC TTG AAA 459 
Cys Glu Cys Lys Glu Gly Arg Tyr Leu Glu Ile Glu Phe Cys Leu Lys
85 90 95 

CAT AGG AGC TGC CCT CCT GGA TTT GGA GTG GTG CAA GCT G GTACGTGTCA 509 
His Arg Ser Cys Pro Pro Gly Phe Gly Val Val Gln Ala
100 105 110 



ATGTGCAGCA AAATTAATTA GGATCATGCA AAGTCAGATA GTTGTGACAG TTTAGGAGAA 569 
CACTTTTGTT CTGATGACAT TATAGGATAG CAAATTGCAA AGGTAATGAA ACCTGCCAGG 629 
TAGGTACTAT GTGTCTGGAG TGCTTCCAAA GGACCATTGC TCAGAGGAAT ACTTTGCCAC 689 
TACAGGGCAA TTTAATGACA AATCTCAAAT GCAGCAAATT ATTCTCTCAT GAGATGCATG 749 

ATGGTTTTTT TTTTTTTTTT TAAAGAAACA AACTCAAGTT GCACTATTGA TAGTTGATCT 809

ATACCTCTAT ATTTCACTTC AGCATGGACA CCTTCAAACT GCAGCACTTT TTGACAAACA 869 
TCAGAAATGT TAATTTATAC CAAGAGAGTA ATTAT6CTCA TATTAATGAG ACTCTGGAGT 929 
GCTAACAATA AQCAGTTATA ATTAATTATG TAAAAAATGA OAATGGTGAG GGGAATTGCA 989 
TTTCATTATT AAAAACAAGG CTAGTTCTTC CTTTAGCATG GGAGCTGAGT GTTTGGGAGG 1049 
GTAAGGACTA TAGCAGAATC TCTTCAATGA GCTTATTCTT TATCTTAGAC AAAACAGATT 1109 

GTCAAGCCAA GAGCAAGCAC TTGCCTATAA ACCAAGTGCT TTCTCTTTTG CATTTTGAAC 1169

AGCATTGGTC AGGGCTCATG TGTATTGAAT CTTTTAAACC AGTAACCCAC GTTTTTTTTC 1229 
TGCCACATTT GCGAAGCTTC AGTGCAGCCT ATAACTTTTC ATAGCTTGAG AAAATTAAGA 1289 
GTATCCACTT ACTTAGATGG AAGAAGTAAT CAGTATAGAT TCTGATGACT CAGTTTGAAG 1349 
CAGTGTTTCT CAACTGAAGC CCTGCTGATA TTTTAAGAAA TATCTGGATT CCTAGGCTGG 1409
ACTCCTTTTT GTGGGCAGCT GTCCTGCGCA TTGTAGAATT TTGGCAGCAC CCCTGGACTC 1469 
TAGCCACTAG ATACCAATAG CAGTCCTTCC CCCATGTGAC AGCCAAAAAT GTCTTCAGAC 1529 
ACTGTCAAAT GTCGCCAGGT GGCAAAATCA CTCCTGGTTG AGAACAGGGT CATCAATGCT 1589 
AAGTATCTGT AACTATTTTA ACTCTCAAAA CTTGTGATAT ACAAAGTCTA AATTATTAGA 1649

CGACCAATAC TTTAGGTTTA AAGGCATACA AATGAAACAT TCAAAAATCA AAATCTATTC 1709 
TGTTTCTCAA ATAGTGAATC TTATAAAATT AATCACAGAA GATGCAAATT GCATCAGAGT 1769 
CCCTTAAAAT TCCTCTTCGT ATGAGTATTT GAGGGAGGAA TTGGTGATAG TTCCTACTTT 1829 
CTATTGGATG GTACTTTGAG ACTCAAAAGC TAAGCTAAGT TGTGTGTGTG TCAQGGTGCG 1889 
GGGTGTGQAA TCCCATCAGA TAAAAGCAAA TCCATGTAAT TCATTCAGTA AGTTGTATAT 1949 
GTAGAAAAAT GAAAAGTGGG CTATGCAGCT TGGAAACTAG AGAATTTTGA AAAATAATGG 2009

AAATCACAAG GATCTTTCTT AAATAAGTAA GAAAATCTGT TTGTAGAATG AAGCAAGCAG 2069 
GCAGCCAGAA GACTCAGAAC AAAAGTACAC ATTTTACTCT GTGTACACTG GCAGCACAGT 2129 
GGGATTTATT TACCTCTCCC TCCCTAAAAA CCCACACAQC GGTTCCTCTT GGGAAATAAG 2189 
AGGTTTCCAG CCCAAAGAGA AGGAAAGACT ATGTGGTOTT ACTCTAAAAA GTATTTAATA 2249 
ACCGTTTTGT TGTTGCTGTT GCTGTTTTGA AATCAGATTG TCTCCTCTCC ATATTTTATT 2309 
TACTTCATTC TGTTAATTCC TGTGGAATTA CTTAGAGCAA GCATGGTGAA TTCTCAACTG 2369

TAAAGCCAAA TTTCTCCATC ATTATAATTT CACATTTTGC CTGGCAGGTT ATAATTTTTA 2429 
TATTTCCACT GATAGTAATA AGGTAAAATC ATTACTTAGA TGGATAGATC TTTTTCATAA 2489 
AAAGTACCAT CAGTTATAGA GGGAAGTCAT GTTCATGTTC AGGAAGQTCA TTAGATAAAG 2549 
CTTCTGAATA TATTATGAAA CATTAGTTCT GTCATTCTTA GATTCTTTTT GTTAAATAAC 2609 
TTTAAAAGCT AACTTACCTA AAAGAAATAT CTGACACATA TGAACTTCTC ATTAGGATGC 2669 
AGGAGAAGAC CCAAGCCACA GATATGTATC TGAAGAATGA ACAAGATTCT TAGGCCCGGC 2729 
ACGGTGGCTC ACATCTGTAA TCTCAAGAGT TTGAGAGGTC AAGGCGGGCA GATCACCTGA 2789 
GGTCAGGAGT TCAAGACCAG CCTGGCCAAC ATQATGAAAC CCTGCCTCTA CTAAAAATAC 2849 
AAAAATTAGC AGGGCATGGT GGTGCATGCC TGCAACCCTA GCTACTCAGG AGGCTGAGAC 2909 
AGGAGAATCT CTTGAACCCT CGA6GCGGAG GTTGTGGTGA GCTGAGATCC CTCTACTGCA 2969 
CTCCAGCCTG GGTGACAGAG ATGAGACTCC GTCCCTGCCG CCGCCCCCGC CTTCCCCCCC 3029 
AAAAAGATTC TTCTTCATGC AGAACATACG GCAGTCAACA AAGGGAGACC TGGGTCCAGG 3089 
TGTCCAAGTC ACTTATTTCG AGTAAATTAG CAATGAAAGA ATGCCATGGA ATCCCTGCCC 3149

AAATACCTCT GCTTATGATA TTGTAGAATT TGATATAGAG TTGTATCCCA TTTAAGGAGT 3209 
AGGATGTAGT AGGAAAGTAC TAAAAACAAA CACACAAACA GAAAACCCTC TTTGCTTTGT 3269 
AAGGTGGTTC CTAAGATAAT GTCAGTGCAA TGCTGGAAAT AATATTTAAT ATGTGAAGGT 3329 
TTTAGGCTGT GTTTTCCCCT CCTGTTCTTT TTTTCTGCCA GCCCTTTGTC ATTTTTGCAG 3389 
GTCAATGAAT CATGTAGAAA GAGACAGGAG ATGAAACTAG AACCAGTCCA TTTTGCCCCT 3449 
TTTTTTATTT TCTGGTTTTG GTAAAAGATA CAATGAGGTA GGAGGTTGAG ATTTATAAAT 3509

GAAGTTTAAT AAGTTTCTGT AGCTTTGATT TTTCTCTTTC ATATTTGTTA TCTTGCATAA 3569 
GCCAGAATTG GCCTGTAAAA TCTACATATG GATATTGAAG TCTAAATCTG TTCAACTAGC 3629 
TTACACTAGA TGGAGATATT TTCATATTCA GATACACTGG AATGTATGAT CTAGCCATGC 3689 
GTAATATA6T CAAGTGTTTG AAGGTATTTA TTTTTAATAG CGTCTTTAGT TGTGGACTGG 3749 
TTCAAGTTTT TCTGCCAATG ATTTCTTCAA ATTTATCAAA TATTTTTCCA TCATGAAGTA 3809 
AAATGCCCTT GCAGTCACCC TTCCTGAAGT TTGAACGACT CTGCTGTTTT AAACAGTTTA 3869 
AGCAAATGGT ATATCATCTT CCGTTTACTA TGTAGCTTAA CTGCAGGCTT ACGCTTTTGA 3929 
GTCAGCGGCC AACTTTATTG CCACCTTCAA AAGTTTATTA TAATGTTGTA AATTTTTACT 3989 
TCTCAAGGTT AGCATACTTA GGAGTTGCTT CACAATTAGG ATTCAGGAAA GAAAGAACTT 4049 
CAGTAGGAAC TGATTGGAAT TTAATGATGC AGCATTCAAT GG6TACTAAT TTCAAAGAAT 4109 
GATATTACAG CAGACACACA GCAGTTATCT TGATTTTCTA GGAATAATTG TATGAAGAAT 4169 
ATGGCTGACA ACACGGCCTT ACTGCCACTC AGCGGAGGCT GGACTAATGA ACACCCTACC 4229 
CTTCTTTCCT TTCCTCTCAC ATTTCATGAG CGTTTTGTAG GTAACGAGAA AATTGACTTG 4289

CATTTGCATT ACAAGGAGGA GAAACTGGCA AAGGGGATGA TGGTGGAAGT TTTGTTCTGT 4349 
CTAATGAAGT GAAAAATGAA AATGCTAGAG TTTTGTGCAA CATAATAGTA GCAGTAAAAA 4409 
CCAAGTGAAA AGTCTTTCCA AAACTGTGTT AAGAGGGCAT CTGCTGGGAA ACGATTTGAG 4469 
GAGAAGGTAC TAAATTGCTT GGTATTTTCC GTAG GA ACC CCA GAG CGA AAT ACA 4523 

Gly Thr Pro Glu Arg Asn Thr 
115

GTT TGC AAA AGA TGT CCA GAT GGG TTC TTC TCA AAT GAG ACG TCA TCT 4571 
Val Cys Lys Arg Cys Pro Asp Gly Phe Phe Ser Asn Glu Thr Ser Ser
120 125 130 135 

AAA GCA CCC TGT AGA AAA CAC ACA AAT TGC AGT GTC TTT GGT CTC CTG 4619
Lys Ala Pro Cys Arg Lys His Thr Asn Cys Ser Val Phe Gly Leu Leu
140 145 150 

CTA ACT CAG AAA GGA AAT GCA ACA CAC GAC AAC ATA TGT TCC GGA AAC 4667 
Leu Thr Gln Lys Gly Asn Ala Thr His Asp Asn Ile Cys Ser Gly Asn
155 160 165 

ACT GAA TCA ACT CAA AAA TGT GGA ATA G GTAATTACAT TCCAAAATAC 4715 

Ser Glu Ser Thr Gln Lys Cys Gly Ile
170 175

GTCTTTGTAC GATTTTGTAG TATCATCTCT CTCTCTGAGT TGAACACAAG GCCTCCAGCC 4775 
ACATTCTTGG TCAAACTTAC ATTTTCCCTT TCTTGAATCT TAACCAGCTA AGGCTACTCT 4835 
CGATGCATTA CTGCTAAAGC TACCACTCAG AATCTCTCAA AAACTCATCT TCTCACAGAT 4895 
AACACCTCAA AGCTTGATTT TCTCTCCTTT CACACTQAAA TCAAATCTTG CCCATAGGCA 4955

AAGGGCAGTG TCAAGTTTGC CACTGAGATG AAATTAGGAG AGTCCAAACT GTAGAATTCA 5015 
CGTTGTGTGT TATTACTTTC ACGAATGTCT GTATTATTAA CTAAAGTATA TATTGGCAAC 5075 
TAAGAAGCAA AGTGATATAA ACATGATGAC AAATTAGGCC AGGCATGGTG GCTTACTCCT 5135 
ATAATCCCAA CATTTTGGOQ GGCCAAGGTA GGCAGATCAC TTQAGGTCAG GATTTCAAGA 5195 
CCAGCCTGAC CAACATGGTG AAACCTTGTC TCTACTAAAA ATACAAAAAT TAGCTGGGCA 5255 
TGGTAGCAGG CACTTCTAGT ACCAGCTACT CAGGGCTGAG GCAGGAGAAT CGCTTGAACC 5315

CAGGAGATGG AGGTTGCAGT GAGCTGAGAT TGTACCACTG CACTCCAGTC TGQGCAACAG 5375 
AGCAAGATTT CATCACACAC ACACACACAC ACACACACAC ACACATTAGA AATGTGTACT 5435 
TGGCTTTGTT ACCTATGGTA TTAGTGCATC TATTGCATGG AACTTCCAAG CTACTCTGGT 5495 
TGTGTTAAGC TCTTCATTGG GTACAGGTCA CTAGTATTAA GTTCAGGTTA TTCGGATGCA 5555 
TTCCACGGTA GTGATGACAA TTCATCAGGC TAGTGTGTGT GTTCACCTTG TCACTCCCAC 5615 
CACTAGACTA ATCTCAGACC TTCACTCAAA GACACATTAC ACTAAA6ATG ATTTGCTTTT 5675 
TTGTGTTTAA TCAAGCAATG GTATAAACCA GCTTGACTCT CCCCAAACAG TTTTTCGTAC 5735 
TACAAAGAAG TTTATGAAGC AGAGAAATGT GAATTGATAT ATATATGAGA TTCTAACCCA 5795 
GTTCCAGCAT TGTTTCATTG TGTAATTGAA ATCATAGACA AGCCATTTTA GCCTTTGCTT 5855 
TCTTATCTAA AAAAAAAAAA AAAAAAATGA AGGAAGGGGT ATTAAAAGGA GTGATCAAAT 5915 
TTTAACATTC TCTTTAATTA ATTCATTTTT AATTTTACTT TTTTTCATTT ATTGTGCACT 5975 
TACTATGTGG TACTGTGCTA TAGAGGCTTT AACATTTATA AAAACACTGT GAAAGTTGCT 6035 
TCAGATGAAT ATAGGTAGTA GAACGGCAGA ACTAGTATTC AAAGCCAGGT CTGATGAATC 6095

CAAAAACAAA CACCCATTAC TCCCATTTTC TGGGACATAC TTACTCTACC CAGATGCTCT 6155 
GGGCTTTGTA ATGCCTATGT AAATAACATA GTTTTATGTT TGGTTATTTT CCTATGTAAT 6215 
GTCTACTTAT ATATCTGTAT CTATCTCTTG CTTTGTTTCC AAAGGTAAAC TATGTGTCTA 6275 
AATGTGGGCA AAAAATAACA CACTATTCCA AATTACTGTT CAAATTCCTT TAAGTCAGTG 6335 
ATAATTATTT GTTTTGACAT TAATCATGAA GTTCCCTGTG GGTACTAGGT AAACCTTTAA 6395 
TAGAATGTTA ATGTTTGTAT TCATTATAAG AATTTTTGGC TGTTACTTAT TTACAACAAT 6455 
ATTTCACTCT AATTAGACAT TTACTAAACT TTCTCTTGAA AACAATGCCC AAAAAAGAAC 6515 
ATTAGAAGAC ACGTAAGCTC AGTTGGTCTC TGCCACTAAG ACCAGCCAAC AGAAGCTTOA 6575 
TTTTATTCAA ACTTTGCATT TTAGCATATT TTATCTTGGA AAATTCAATT GTGTTGGTTT 6635 
TTTGTTTTTG TTTGTATTGA ATAGACTCTC AGAAATCCAA TTGTTGAGTA AATCTTCTGG 6695 
GTTTTCTAAC CTTTCTTTAG AT GTT ACC CTG TGT GAG GAG GCA TTC TTC AGG 6747

Asp Val Thr Leu Cys Glu Glu Ala Phe Phe Arg 
180 185

TTT GCT GTT CCT ACA AAG TTT ACG CCT AAC TGG CTT AGT GTC TTG GTA 6795
Phe Ala Val Pro Thr Lys Phe Thr Pro Asn Trp Leu Ser Val Leu Val
190 195 200

GAC AAT TTG CCT GGC ACC AAA GTA AAC GCA GAG AGT GTA GAG AGG ATA 6843
Asp Asn Leu Pro Gly Thr Lys Val Asn Ala Glu Ser Val Glu Arg Ile
205 210 215

AAA CGG CAA CAC AGC TCA CAA GAA CAG ACT TTC CAG CTG CTG AAG TTA 6891 
Lys Arg Gln His Ser Ser Gln Glu Gln Thr Phe Gln Leu Leu Lys Leu
220 225 230 235 

TGG AAA CAT CAA AAC AAA GAC CAA GAT ATA GTC AAG AAG ATC ATC CAA G 6940 
Trp Lys His Gln Asn Lys Asp Gln Asp Ile Val Lys Lys Ile Ile Gln
240 245 250 

GTATGATAAT CTAAAATAAA AAGATCAATC AQAAATCAAA GACACCTATT TATCATAAAC 7000 
CAGGAACAAG ACTGCATGTA TGTTTAGTTG TGTGGATCTT GTTTCCCTQT TGGAATCATT 7060

GTTGOACTGA AAAAGTTTCC ACCTQATAAT GTAGATGTGA TTCCACAAAC AGTTATACAA 7120 
GGTTTTGTTC TCACCCCTGC TCCCCAGTTT CCTTGTAAAG TATGTTGAAC ACTCTAAGAG 7180 
AAGAGAAATG CATTTGAAGQ CAGGGCTGTA TCTCAGGGAG TCGCTTCCAG ATCCCTTAAC 7240 
GCTTCTGTAA GCAGCCCCTC TAGACCACCA AOGAGAAGCT CTATAACCAC TTTGTATCTT 7300
ACATTGCACC TCTACCAAGA AGCTCTGTTG TATTTACTTG GTAATTCTCT CCAGGTAGGC 7360 
TTTTCGTAGC TTACAAATAT GTTCTTATTA ATCCTCATGA TATGGCCTGC ATTAAAATTA 7420 
TTTTAATGOC ATATGTTATG AGAATTAATG AOATAAAATC TGAAAAGTGT TTGAGCCTCT 7480 
TGTAGGAAAA AGCTAGTTAC AGCAAAATGT TCTCACATCT TATAAGTTTA TATAAAGATT 7540 
CTCCTTTAGA AATGGTGTGA GAGAOAAACA GAGAGAGATA GGGAGAGAAG TGTGAAAGAA 7600 
TCTGAAGAAA AGGAGTTTCA TCCAGTGTGG ACTGTAAGCT TTACGACACA TGATGGAAAG 7660 
AGTTCTGACT TCAGTAAGCA TTGGGAGGAC ATGCTAGAAG AAAAAGGAAG AAGAGTTTCC 7720
ATAATGCAGA CAGGGTCAGT GAGAAATTCA TTCAGGTCCT CACCAGTAGT TAAATGACTG 7780 
TATAGTCTTG CACTACCCTA AAAAACTTCA AGTATCTGAA ACCGGGGCAA CAGATTTTAG 7840 
GAGACCAACG TCTTTGAGAG CTGATTOCTT TTGCTTATGC AAAGAGTAAA CTTTTATGTT 7900 
TTGAGCAAAC CAAAAGTATT CTTTGAACGT ATAATTAGCC CTGAAGCCGA AAGAAAAGAG 7960 
AAAATCAGAG ACCGTTAGAA TTGGAAGCAA CCAAATTCCC TATTTTATAA ATGAGGACAT 8020 
TTTAACCCAG AAAGATGAAC CGATTTGGCT TAGGGCTCAC AGATACTAAG TGACTCATQT 8080 
CATTAATAGA AATGTTAGTT CCTCCCTCTT AGGTTTGTAC CCTAGCTTAT TACTGAAATA 8140 
TTCTCTAGGC TGTGTGTCTC CTTTAGTTCC TCGACCTGAT GTCTTTGAGT TTTCAGATAT 8200 
CCTCCTCATG GAGGTAGTCC TCTGGTGCTA TGTGTATTCT TTAAAGGCTA GTTACGGCAA 8260 
TTAACTTATC AACTAGCGCC TACTAATGAA ACTTTOTATT ACAAAGTAGC TAACTTGAAT 8320 
ACTTTCCTTT TTTTCTGAAA TGTTATGGTG GTAATTTCTC AAACTTTTTC TTAGAAAACT 8380 
GAGAGTGATG TGTCTTATTT TCTACTGTTA ATTTTCAAAA TTAGGAGCTT CTTCCAAAGT 8440 
TTTGTTGGAT GCCAAAAATA TATAGCATAT TATCTTATTA TAACAAAAAA TATTTATCTC 8500 
AGTTCTTAGA AATAAATGGT GTCACTTAAC TCCCTCTCAA AAGAAAAGGT TATCATTGAA 8560 
ATATAATTAT GAAATTCTGC AAGAACCTTT TGCCTCACGC TTGTTTTATG ATGGCATTGG 8620 
ATGAATATAA ATGATGTGAA CACTTATCTG GGCTTTTGCT TTATGCAG AT ATT GAC 8676 

Asp Ile Asp 



CTC TGT GAA AAC AGC GTG CAG CGG CAC ATT GGA CAT GCT AAC CTC ACC 8724 
Leu Cys Glu Asn Ser Val Gln Arg His Ile Gly His Ala Asn Leu Thr 
255 260 265 270 

TTC GAG CAG CTT CGT AGC TTG ATG GAA AGC TTA CCG GGA AAG AAA GTG 8772 
Phe Glu Gln Leu Arg Ser Leu Met Glu Ser Leu Pro Gly Lys Lys Val 
275 280 285 



GGA GCA GAA GAC ATT GAA AAA ACA ATA AAG GCA TGC AAA CCC AGT GAC 8820 
Gly Ala Glu Asp Ile Glu Lys Thr Ile Lys Ala Cys Lys Pro Ser Asp 
290 295 300 

CAG ATC CTG AAG CTG CTC AGT TTG TGG CGA ATA AAA AAT GGC GAC CAA 8868 
Gln Ile Leu Lys Leu Leu Ser Leu Trp Arg Ile Lys Asn Gly Asp Gln 
305 310 315 

GAC ACC TTG AAG GGC CTA ATG CAC GCA CTA AAG CAC TCA AAG ACG TAC 8916 
Asp Thr Leu Lys Gly Leu Met His Ala Leu Lys His Ser Lys Thr Tyr 
320 325 330 

CAC TTT CCC AAA ACT GTC ACT CAG AGT CTA AAG AAG ACC ATC AGG TTC 8964 
His Phe Pro Lys Thr Val Thr Gln Ser Leu Lys Lys Thr Ile Arg Phe 
335 340 345 350 

CTT CAC AGC TTC ACA ATG TAC AAA TTG TAT CAG AAG TTA TTT TTA GAA 9012 
Leu His Ser Phe Thr Met Tyr Lys Leu Tyr Gln Lys Leu Phe Leu Glu 
355 360 365 

ATG ATA GGT AAC CAG GTC CAA TCA GTA AAA ATA AGC TGC TTA 9054 
Met Ile Gly Asn Gln Val Gln Ser Val Lys Ile Ser Cys Leu 
370 375 380 

TAACTGGAAA TQGCCATTGA GCTGTTTCCT CACAATTGGC GAGATCCCAT GGATGAGTAA 9114 
ACTGTTTCTC AGGCACTTGA GGCTTTCAGT GATATCTTTC TCATTACCAG TGACTAATTT 9174 
TGCCACAGGG TACTAAAAGA AACTATGATG TGGAGAAAGG ACTAACATCT CCTCCAATAA 9234 
ACCCCAAATG GTTAATCCAA CTGTCAGATC TGGATCGTTA TCTACTGACT ATATTTTCCC 9294 
TTATTACTGC TTGCAGTAAT TCAACTGGAA ATTAAAAAAA AAAAACTAGA CTCCACTGGG 9354 
CCTTACTAAA TATGGGAATG TCTAACTTAA ATAGCTTTGG GATTCCAGCT ATGCTAGAGG 9414 
CTTTTATTAG AAAGCCATAT TTTTTTCTGT AAAAGTTACT AATATATCTG TAACACTATT 9474 
ACAOTATTOC TATTTATATT CATTCAGATA TAAGATTTGG AGATATTATC ATCCTATAAA 9534 
OAAACGGTAT GACTTAATTT TAGAAAGAAA ATTATATTCT OTTTATTATQ ACAAATGAAA 9594 
GAGAAAATAT ATATTTTTAA TGGAAAGTTT GTAGCATTTT TCTAATAGGT ACTGCCATAT 9654 
TTTTCTGTGT GGAGTATTTT TATAATTTTA TCTGTATAAG CTGTAATATC ATTTTATAGA 9714 
AAATGCATTA TTTAGTCAAT TGTTTAATGT TGGAAAACAT ATGAAATATA AATTATCTGA 9774 
ATATTAGATG CTCTGAGAAA TTGAATGTAC CTTATTTAAA AGATTTTATG GTTTTATAAC 9834 
TATATAAATG ACATTATTAA AGTTTTCAAA TTATTTTTTA TTGCTTTCTC TGTTGCTTTT 9894 
ATTT 9898 



(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 401 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

Met Asn Asn Leu Leu Cys Cys Ala Leu Val Phe Leu Asp Ile Ser 
-20 -15 -10 

Ile Lys Trp Thr Thr Gln Glu Thr Phe Pro Pro Lys Tyr Leu His 
-5 1 5 

Tyr Asp Glu Glu Thr Ser His Gln Leu Leu Cys Asp Lys Cys Pro 
10 15 20 

Pro Gly Thr Tyr Leu Lys Gln His Cys Thr Ala Lys Trp Lys Thr 
25 30 35 

Val Cys Ala Pro Cys Pro Asp His Tyr Tyr Thr Asp Ser Trp His 
40 45 50 

Thr Ser Asp Glu Cys Leu Tyr Cys Ser Pro Val Cys Lys Glu Leu 
55 60 65 

Gln Tyr Val Lys Gln Glu Cys Asn Arg Thr His Asn Arg Val Cys 
70 75 80 

Glu Cys Lys Glu Gly Arg Tyr Leu Glu Ile Glu Phe Cys Leu Lys 
85 90 95 

His Arg Ser Cys Pro Pro Gly Phe Gly Val Val Gln Ala Gly Thr 
100 105 110 

Pro Glu Arg Asn Thr Val Cys Lys Arg Cys Pro Asp Gly Phe Phe 
115 120 125 

Ser Asn Glu Thr Ser Ser Lys Ala Pro Cys Arg Lys His Thr Asn 
130 135 140 

Cys Ser Val Phe Gly Leu Leu Leu Thr Gln Lys Gly Asn Ala Thr 
145 150 155 

His Asp Asn Ile Cys Ser Gly Asn Ser Glu Ser Thr Gln Lys Cys 
160 165 170 

Gly Ile Asp Val Thr Leu Cys Glu Glu Ala Phe Phe Arg Phe Ala 
175 180 185 

Val Pro Thr Lys Phe Thr Pro Asn Trp Leu Ser Val Leu Val Asp 
190 195 200 

Asn Leu Pro Gly Thr Lys Val Asn Ala Glu Ser Val Glu Arg Ile 
205 210 215 

Lys Arg Gln His Ser Ser Gln Glu Gln Thr Phe Gln Leu Leu Lys 
220 225 230 

Leu Trp Lys His Gln Asn Lys Asp Gln Asp Ile Val Lys Lys Ile 
235 240 245 

Ile Gln Asp Ile Asp Leu Cys Glu Asn Ser Val Gln Arg His Ile 
250 255 260 

Gly His Ala Asn Leu Thr Phe Glu Gln Leu Arg Ser Leu Met Glu 
265 270 275 

Ser Leu Pro Gly Lys Lys Val Gly Ala Glu Asp Ile Glu Lys Thr 
280 285 290 

Ile Lys Ala Cys Lys Pro Ser Asp Gln Ile Leu Lys Leu Leu Ser 
295 300 305 

Leu Trp Arg Ile Lys Asn Gly Asp Gln Asp Thr Leu Lys Gly Leu 
310 315 320 

Met His Ala Leu Lys His Ser Lys Thr Tyr His Phe Pro Lys Thr 
325 330 335 

Val Thr Gln Ser Leu Lys Lys Thr Ile Arg Phe Leu His Ser Phe 
340 345 350 

Thr Met Tyr Lys Leu Tyr Gln Lys Leu Phe Leu Glu Met Ile Gly 
355 360 365 

Asn Gln Val Gln Ser Val Lys Ile Ser Cys Leu 
370 375 380 

(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1206 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 



ATGAACAACT TGCTGTGCTG CGCGCTCGTG TTTCTGGACA TCTCCATTAA GTGGACCACC 60 
CAGGAAACGT TTCCTCCAAA GTACCTTCAT TATGACGAAG AAACCTCTCA TCAGCTGTTG 120 
TGTGACAAAT GTCCTCCTGG TACCTACCTA AAACAACACT GTACAGCAAA GTGGAAGACC 180 
GTGTGCGCCC CTTGCCCTGA CCACTACTAC ACAGACAGCT GGCACACCAG TGACGAGTGT 240 
CTATACTGCA GCCCCGTGTG CAAGGAGCTG CAGTACGTCA AGCAGGAGTG CAATCGCACC 300 
CACAACCGCG TGTGCGAATG CAAGGAAGGG CGCTACCTTG AGATAGAGTT CTGCTTGAAA 360 
CATAGGAGCT GCCCTCCTGG ATTTGGAGTG GTGCAAGCTG GAACCCCAGA GCGAAATACA 420 
GTTTGCAAAA GATGTCCAGA TGGGTTCTTC TCAAATGAGA CGTCATCTAA AGCACCCTGT 480 
AGAAAACACA CAAATTGCAG TGTCTTTGGT CTCCTGCTAA CTCAGAAAGG AAATGCAACA 540 
CACGACAACA TATGTTCCGG AAACAGTGAA TCAACTCAAA AATGTGGAAT AGATGTTACC 600 
CTGTGTGAGG AGGCATTCTT CAGGTTTGCT GTTCCTACAA AGTTTACGCC TAACTGGCTT 660 
AGTGTCTTGG TAGACAATTT GCCTGGCACC AAAGTAAACG CAGAGAGTGT AGAGAGGATA 720 
AAACGGCAAC ACAGCTCACA AGAACAGACT TTCCAGCTGC TGAAGTTATG GAAACATCAA 780 
AACAAAGACC AAGATATAGT CAAGAAGATC ATCCAAGATA TTGACCTCTG TGAAAACAGC 840 
GTGCAGCGGC ACATTGGACA TGCTAACCTC ACCTTCGAGC AGCTTCGTAG CTTGATGGAA 900 
AGCTTACCGG GAAAGAAAGT GGGAGCAGAA GACATTGAAA AAACAATAAA GGCATGCAAA 960 
CCCAGTGACC AGATCCTGAA GCTGCTCAGT TTGTGGCGAA TAAAAAATGG CGACCAAGAC 1020 
ACCTTGAAGG GCCTAATGCA CGCACTAAAG CACTCAAAGA CGTACCACTT TCCCAAAACT 1080 
GTCACTCAGA GTCTAAAGAA GACCATCAGG TTCCTTCACA GCTTCACAAT GTACAAATTG 1140 
TATCAGAAGT TATTTTTAGA AATGATAGGT AACCAGGTCC AATCAGTAAA AATAAGCTGC 1200 
TTATAA 1206 
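
The relationship between SEQ ID NO: 4 (1206 base pairs) and SEQ ID NO: 3 (401 amino acids) can be cross-checked by translating the cDNA with the standard genetic code. The following is a minimal sketch, not part of the original listing: the function names `translate` and `encoded_residues` are hypothetical, the standard nuclear codon table is assumed, and only the first 60 bases and the final 66 bases of SEQ ID NO: 4 are used.

```python
from itertools import product

# Standard genetic code, built in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA string to one-letter amino acids ('*' = stop)."""
    dna = "".join(dna.split()).upper()  # drop the spaces between 10-base blocks
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def encoded_residues(n_bases: int) -> int:
    """Residues encoded by an open reading frame that ends in one stop codon."""
    codons, remainder = divmod(n_bases, 3)
    if remainder:
        raise ValueError("length is not a multiple of 3")
    return codons - 1  # the final codon is the stop codon


# Bases 1-60 and bases 1141-1206 of SEQ ID NO: 4 (both start in frame).
FIRST_60 = "ATGAACAACT TGCTGTGCTG CGCGCTCGTG TTTCTGGACA TCTCCATTAA GTGGACCACC"
LAST_66 = ("TATCAGAAGT TATTTTTAGA AATGATAGGT AACCAGGTCC "
           "AATCAGTAAA AATAAGCTGC TTATAA")

n_terminus = translate(FIRST_60)   # should match residues -21 to -2 of SEQ ID NO: 3
c_terminus = translate(LAST_66)    # should end with ...Ser Cys Leu and a stop
total_residues = encoded_residues(1206)
```

Under these assumptions, 1206 bases correspond to 402 codons, that is, 401 residues plus the terminating TAA codon, which agrees with the length stated for SEQ ID NO: 3.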
Claims 

1. A DNA comprising the nucleotide sequences of Sequence ID No. 1 and Sequence ID No. 2 in the Sequence Table. 

2. The DNA according to claim 1, wherein the Sequence ID No. 1 includes the first exon of the OCIF gene and the 
Sequence ID No. 2 includes the second, third, fourth, and fifth exons. 

3. A protein exhibiting the activity of inhibiting differentiation and/or maturation of osteoclasts and having the following 
physicochemical characteristics: 



(a) molecular weight (SDS-PAGE): 
(i) Under reducing conditions: about 60 kD, 

(ii) Under non-reducing conditions: about 60 kD and about 120 kD; 

(b) amino acid sequence: 

includes an amino acid sequence of the Sequence ID No. 3 in the Sequence Table, 

(c) affinity: 

exhibits affinity to a cation exchanger and heparin, and 

(d) heat stability: 

(i) the osteoclastogenesis-inhibitory activity is reduced when treated with heat at 70°C for 10 minutes or at 
56°C for 30 minutes, 

(ii) the osteoclastogenesis-inhibitory activity is lost when treated with heat at 90°C for 10 minutes. 

4. A process for producing a protein exhibiting an activity of inhibiting differentiation and/or maturation of osteoclasts 
and having the following physicochemical characteristics: 

(a) molecular weight (SDS-PAGE): 

(i) Under reducing conditions: about 60 kD, 

(ii) Under non-reducing conditions: about 60 kD and about 120 kD; 

(b) amino acid sequence: 

includes an amino acid sequence of the Sequence ID No. 3 of the Sequence Table, 

(c) affinity: 

exhibits affinity to a cation exchanger and heparin, and 

(d) heat stability: 

(i) the osteoclastogenesis-inhibitory activity is reduced when treated with heat at 70°C for 10 minutes or at 
56°C for 30 minutes, 

(ii) the osteoclastogenesis-inhibitory activity is lost when treated with heat at 90°C for 10 minutes, 

the process comprising inserting a DNA including the nucleotide sequences of Sequence ID No. 1 and Sequence ID No. 2 in 
the Sequence Table into an expression vector, producing a vector capable of expressing a protein having the 
above-mentioned physicochemical characteristics and exhibiting the activity of inhibiting differentiation and/or 
maturation of osteoclasts, and producing this protein by a genetic engineering technique. 
Figure 1 